Preface



The origins of the Asiacrypt series of conferences can be traced back to 1990, 
when the first Auscrypt conference was held, although the name Asiacrypt was 
first used for the 1991 conference in Japan. Since Asiacrypt 2000, the
conference has been one of three annual conferences organized by the
International Association for Cryptologic Research (IACR). The continuing success of
Asiacrypt is in no small part due to the efforts of the Asiacrypt Steering
Committee (ASC) and the strong support of the IACR Board of Directors.

There were 153 papers submitted to Asiacrypt 2001 and 33 of these were 
accepted for inclusion in these proceedings. The authors of every paper, whether 
accepted or not, made a valued contribution to the success of the conference. 
Sending out rejection notifications to so many hard working authors is one of 
the most unpleasant tasks of the Program Chair. 

The review process lasted some 10 weeks and consisted of an initial refereeing
phase followed by an extensive discussion period. My heartfelt thanks go
to all members of the Program Committee who put in extreme amounts of time 
to give their expert analysis and opinions on the submissions. All papers were 
reviewed by at least three committee members; in many cases, particularly for 
those papers submitted by committee members, additional reviews were obtai- 
ned. Specialist reviews were provided by an army of external reviewers without 
whom our decisions would have been much more difficult. A list of their names 
is included overleaf; I hope this is complete, but if there are omissions please be 
assured this was not intentional. My thanks go to all of them. 

In addition to the contributed papers, I was delighted to be able to secure two 
eminent and engaging speakers for the invited talks at the conference. Arjen K. 
Lenstra talked on “Impossible Security: Matching AES Security Using Public 
Key Systems” and Brendan McKay talked on “Debunking the Bible Codes”. 
As is traditional at all IACR conferences, a rump session was held to give the
opportunity to hear latest results and work in progress on a wide variety of 
topics. I would like to thank Bill Caelli for agreeing to take charge of this event 
with his usual flair. 

The smooth running of Asiacrypt 2001 was engineered by an Organizing 
Committee led by the General Chair, Ed Dawson, and his deputy Mark Looi. 
Other members of the committee were Andrew Clark, Ernest Foo, Betty Hans- 
ford, Lauren May, Christine Orme and Jason Thomas. 

I received sterling advice from many experienced people at all stages of the 
program preparation. Members of the ASC and the IACR board were all very
supportive. Special mention must go to Tatsuaki Okamoto who acted as Advi- 
sory Member of the committee and provided advice based on his considerable 
experience. I would also like to particularly thank previous chairs of IACR
conferences, Mihir Bellare, Bart Preneel, and Joe Kilian, who got very used to being
bothered by me with requests for advice on all kinds of problems I had not
encountered before and were always prepared to give their insightful opinions.

Any conference today relies heavily on technology to ease the administrative 
burden. All paper submissions to Asiacrypt 2001 were received electronically 
using the web based submission software which has been provided by Chanathip 
Namprempre. Papers were then seamlessly imported into the review software 
which was kindly provided by COSIC, Katholieke Universiteit Leuven, cour- 
tesy of Bart Preneel. The submission software was supported by COSIC’s Wim 
Moreau, who was extremely helpful in providing advice and bug fixes, and went 
to great lengths to provide extra features in the software at very short notice. 
Installing and maintaining the software at ISRC was Andrew Clark. Andrew 
worked tirelessly to ensure that the web server and review server were (almost) 
always available, provided several additional features to the software, and gene- 
rally worked miracles to solve all the problems I came up with. 

I was assisted in many different ways by numerous other ISRC members. 
Ed Dawson, as General Chair and also as Director of ISRC, provided his sup- 
port throughout. Greg Maitland and Kapali Viswanathan came to my rescue on 
numerous occasions. 

Having seen all the people who contributed to the process of preparing the 
program, it may be deduced that I did very little myself. Nevertheless all those 
late nights and weekends went somewhere and I would like to acknowledge the 
forbearance of my family D + C^, and my colleagues and research students, who 
experienced severe denial of service at many times over the months leading up 
to the conference. 
Abstract. In 1996, a new cryptosystem called NTRU was introduced,
related to the hardness of finding short vectors in specific lattices.
At Eurocrypt 2001, the NTRU Signature Scheme (NSS), a signature 
scheme apparently related to the same hard problem, was proposed. 
In this paper, we show that the problem on which NSS relies is much 
easier than anticipated, and we describe an attack that allows efficient 
forgery of a signature on any message. Additionally, we demonstrate 
that a transcript of signatures leaks information about the secret key: 
using a correlation attack, it is possible to recover the key from a few 
tens of thousands of signatures. The attacks apply to the recently 
proposed parameter sets NSS251-3-SHA1-1, NSS347-3-SHA1-1, and 
NSS503-3-SHA1-1 in [2]. Following the attacks, NTRU researchers have 
investigated enhanced encoding/verification methods in [11]. 

Keywords: NSS, NTRU, Signature Scheme, Forgery, Transcript Anal- 
ysis, Lattice, Cryptanalysis, Key Recovery, Cyclotomic Integer. 



1 Introduction 

Recently, Hoffstein, Pipher, and Silverman introduced a public-key signature 
scheme called NSS (the “NTRU Signature Scheme”) [9]. This scheme is related to 
the NTRU cryptosystem, which was first introduced at the CRYPTO ‘96 rump 
session. An attack on NTRU was quickly found by Coppersmith and Shamir 
(see [4]), which led the authors to adopt larger parameters, and reformulate the 
underlying hard problem as a lattice problem. The current version of NTRU, 
as published in [6], remains unbroken. NSS is also related to the problem of 
finding short vectors in certain lattices, and is an improvement over an early 
version [7] presented at the CRYPTO 2000 rump session. This version proved 
to be insecure, which the designers observed at an early stage. Ilya Mironov 
[14] made the same observation independently a few months later. Basically, it 

* This work has been partially supported by the French Ministry of Research under 
the RNRT Project “Turbo-Signatures” 

C. Boyd (Ed.): ASIACRYPT 2001, LNCS 2248, pp. 1-20, 2001. 
appeared that signatures leaked information about the private key, which allowed
for statistical attacks. 

To eliminate the disclosed weaknesses, certain adaptations were made, yield- 
ing the scheme described in [9] and [8]. Unfortunately, these signatures still leak 
information about the private key. More precisely, it turns out that correlations
between certain coefficients in the signature and the private key are sufficient to
recover the entire private key.

Even more dramatic is a direct forgery attack that enables
an adversary to sign arbitrary messages without any knowledge of the private
key. While the flaw does not rule out potentially secure future revisions, our 
analysis shows that the scheme as presented in [9], [8] and [2] is completely 
insecure. 

This paper is organized as follows. In section 2 we provide background and 
describe NSS in more detail. In section 3 we describe the efficient forgery proce- 
dure. Next, in section 4 we explain how to recover the key by examining valid 
signatures. In section 5 we discuss some revisions suggested by the authors of 
NSS to repair the signature scheme. 

2 Description of NSS 

Here we review some mathematics that underlie NSS, and give a brief description 
of the scheme. We refer readers to [9] and [8] for more detailed information. 



2.1 Background Mathematics 

The key underlying mathematical structure of the scheme is the polynomial ring

    $R = \mathbb{Z}_q[X]/(X^N - 1)$    (1)

where $N$ and $q$ are integers. In practice, $N$ is prime (e.g., 251) and $q$ is a power
of 2 (e.g., 128). Elements of $R$ are polynomials of degree at most $N - 1$
with coefficients in the range $(-q/2, q/2]$.

Multiplication in this ring is like ordinary polynomial multiplication, but
subject to the relations $X^{N+k} = X^k$ for any $k \ge 0$. This means that the
coefficient of $X^k$ in the product $a * b$ of
$a = a_0 + a_1 X + \dots + a_{N-1} X^{N-1}$ and
$b = b_0 + b_1 X + \dots + b_{N-1} X^{N-1}$ is

    $(a * b)_k = \sum_{i + j \equiv k \bmod N} a_i b_j$.    (2)



The multiplication of two polynomials in $R$ is also called the convolution product
of the two polynomials. For any polynomial $a \in R$, it is also convenient to
introduce the convolution matrix of $a$ as follows: let $M_a$ be the $N \times N$ matrix indexed
by $\{0, \dots, N - 1\}$, where the element in position $(i, j)$ is equal to $a_{(j-i) \bmod N}$.
With this representation, the product of $a$ and $b$ can also be expressed as the
product of the row vector $(a_0, \dots, a_{N-1})$ with the matrix $M_b$. From now on, we
will freely identify any polynomial with its corresponding row vector.
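As a small illustration of equation (2) and of the convolution matrix, the following Python sketch computes a product both ways and checks that they agree. The function names and the toy parameters ($N = 5$, $q = 8$) are our own choices for readability, and coefficients are kept in $0, \dots, q-1$ rather than in the centered range.

```python
def conv(a, b, q):
    """Cyclic convolution a * b in Z_q[X]/(X^N - 1); result coefficients in 0..q-1."""
    n = len(a)
    out = [0] * n
    for i, ai in enumerate(a):
        if ai == 0:
            continue
        for j, bj in enumerate(b):
            # X^i * X^j = X^((i + j) mod N) because X^N = 1
            out[(i + j) % n] = (out[(i + j) % n] + ai * bj) % q
    return out


def conv_matrix(b):
    """N x N matrix M_b whose entry (i, j) is b[(j - i) mod N]."""
    n = len(b)
    return [[b[(j - i) % n] for j in range(n)] for i in range(n)]


def row_times_matrix(a, mat, q):
    """Row vector a times matrix mat, reduced modulo q."""
    n = len(a)
    return [sum(a[i] * mat[i][j] for i in range(n)) % q for j in range(n)]


# Toy example: a = 1 + 2X + 3X^4, b = X + 4X^3, with N = 5 and q = 8.
a = [1, 2, 0, 0, 3]
b = [0, 1, 0, 4, 0]
product = conv(a, b, 8)
```

Both routes give the same coefficient vector, which is the identification of polynomials with row vectors used in the rest of the paper.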
While the ring (1) may seem unnatural at first, it is directly related to the
ring of integers in the cyclotomic field

    $\mathbb{Q}(\zeta_N) = \mathbb{Q}[X]/(X^{N-1} + \dots + X + 1)$.    (3)

This field $\mathbb{Q}(\zeta_N)$ is a field extension of the rational numbers $\mathbb{Q}$, and has a subring
of algebraic integers, $\mathbb{Z}(\zeta_N)$, analogous to the ordinary integers $\mathbb{Z} \subset \mathbb{Q}$. In fact,
the set of polynomials $p \in R$ with $p(1) = 1$ is isomorphic to the integers
$\mathbb{Z}(\zeta_N) \subset \mathbb{Q}(\zeta_N)$, and the convolution product described above in (2) is simply the ordinary
multiplication operation in this field.

This field has been extensively studied and has been proposed for use in 
other cryptographic applications such as factoring and as basis for a public key 
cryptosystem (e.g. [17]), and is likely to appear in further analysis of NTRU 
related cryptosystems. However, further familiarity with this field is not required 
for the rest of this paper. 

2.2 The NSS Signature Scheme 

The public key of NSS consists of a polynomial $h$ of degree $N - 1$, and the private
key of the scheme consists of two polynomials $f$ and $g$ with “small coefficients”
such that $f * h = g$, where the polynomials are elements of $R = \mathbb{Z}_q[X]/(X^N - 1)$,
and $q$ and $N$ are typically 128 and 251.

In order to describe the scheme further, additional parameters are needed.
These parameters include the integer $p$, which is typically chosen to be 3, and the
integers $d_f$, $d_g$ and $d_m$, whose suggested values are respectively 70, 40 and 32.
The latter parameters are used to define several families of polynomials denoted
by $\mathcal{L}(d_1, d_2)$, a notation that refers to the set of polynomials of degree at most
$N - 1$ with $d_1$ coefficients 1, $d_2$ coefficients $-1$ and all other coefficients 0.

Key generation: Two polynomials $f$ and $g$ are defined as

    $f = f_0 + p f_1$
    $g = g_0 + p g_1$

where $f_0$ and $g_0$ are publicly known small polynomials (typically $f_0 = 1$ and
$g_0 = 1 - 2X$). The polynomial $f_1$ is randomly chosen from $\mathcal{L}(d_f, d_f)$ and
similarly $g_1$ is randomly chosen from $\mathcal{L}(d_g, d_g)$. It is required that $f$ be invertible
(i.e., there exists some $f^{-1}$ with $f * f^{-1} = 1 \bmod q$). This is true with very
high probability; in any case the preceding step may be repeated by choosing a
different polynomial $f_1$.
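The sampling part of key generation can be sketched as follows. This is only an illustration under stated assumptions: the helper names are ours, Python's `random` module stands in for a proper randomness source, and the computation of the public key $h$ is omitted. The invertibility test relies on the standard fact that, for $q$ a power of 2, $f$ is invertible modulo $q$ exactly when $f \bmod 2$ is coprime to $X^N + 1$ over the two-element field.

```python
import random


def sample_L(n, d1, d2, rng):
    """Random polynomial with d1 coefficients +1, d2 coefficients -1, all others 0."""
    coeffs = [0] * n
    positions = rng.sample(range(n), d1 + d2)
    for pos in positions[:d1]:
        coeffs[pos] = 1
    for pos in positions[d1:]:
        coeffs[pos] = -1
    return coeffs


def poly_mod2_bits(coeffs):
    """Pack the coefficients mod 2 into an int (bit i is the coefficient of X^i)."""
    bits = 0
    for i, c in enumerate(coeffs):
        if c % 2:
            bits |= 1 << i
    return bits


def gf2_gcd(a, b):
    """gcd of two polynomials over GF(2), each stored as a bitmask."""
    while b:
        # Reduce a modulo b by cancelling its leading term repeatedly.
        while a.bit_length() >= b.bit_length():
            a ^= b << (a.bit_length() - b.bit_length())
        a, b = b, a
    return a


def is_invertible_mod_2power(f):
    """True iff gcd(f mod 2, X^N + 1) = 1 over GF(2)."""
    n = len(f)
    return gf2_gcd((1 << n) | 1, poly_mod2_bits(f)) == 1


def keygen_private(n=251, p=3, df=70, dg=40, seed=None):
    """Sample f = f0 + p*f1 and g = g0 + p*g1 with f0 = 1 and g0 = 1 - 2X."""
    rng = random.Random(seed)
    f0 = [1] + [0] * (n - 1)
    g0 = [1, -2] + [0] * (n - 2)
    while True:
        f1 = sample_L(n, df, df, rng)
        f = [x + p * y for x, y in zip(f0, f1)]
        if is_invertible_mod_2power(f):   # otherwise resample f1
            break
    g1 = sample_L(n, dg, dg, rng)
    g = [x + p * y for x, y in zip(g0, g1)]
    return f, g
```

A call such as `keygen_private(seed=1)` returns the two private polynomials as coefficient lists; the retry loop mirrors the remark that the step may be repeated with a different $f_1$.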

Signature generation: To sign a message, one transforms the message to be
signed into a message representative according to a hash function-based
procedure such as that described in [2]. This message representative is a polynomial
in $\mathcal{L}(d_m, d_m)$. The signer first computes

    $w = m + w_1 + p w_2$

where $w_1$, $w_2$ are two polynomials with small coefficients generated at random
in a rather complex manner that is described in Appendix B. The signer next
computes the convolution

    $s = f * w \bmod q$

and outputs the pair $(m, s)$ as the signature of $m$.



Signature verification: A signature $(m, s)$ consists of the message $m$ together
with the polynomial $s$ of degree $N - 1$, with coefficients reduced modulo $q$.
Signature verification depends on two further parameters $D_{\min}$ and $D_{\max}$ (paper [9]
suggests $D_{\min} = 55$ and $D_{\max} = 87$, together with the parameters suggested
above), and upon the concept of deviation. Given two polynomials $A$ and $B$
of degree $N - 1$, the deviation $\mathrm{Dev}(A, B)$ is the function that counts the
number of coefficients where $(A \bmod q) \bmod p$ and $(B \bmod q) \bmod p$ differ. Here,
modular reduction computes the coefficients in the interval $(-q/2, q/2]$ (resp.
$(-p/2, p/2]$). If $A$ and $B$ are two random polynomials in the ring $\mathbb{Z}_q[X]/(X^N - 1)$
and $p$ equals 3, we would expect $\mathrm{Dev}(A, B)$ to be about $\frac{2}{3} N \approx 167$, since the
probability that $A_i$ and $B_i$ differ modulo 3 is about $\frac{2}{3}$.

To verify a signature, first it is checked that $s \neq 0$. Then the polynomial
$t = s * h \pmod{q}$ is computed, and the two conditions

    $D_{\min} \le \mathrm{Dev}(s, f_0 * m) \le D_{\max}$
    $D_{\min} \le \mathrm{Dev}(t, g_0 * m) \le D_{\max}$

are checked. If both conditions hold, the signature is accepted as valid.

The soundness of the scheme follows from technical estimates, which we omit. 
It should be noted that signature generation does not necessarily produce valid 
signatures. With the above parameters, signature verification fails in twenty 
percent of the cases and, when this happens, the signer has to create another 
signature. 
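The deviation count and the verification check can be sketched in Python as below. The function names are ours, and treating both deviation bounds as inclusive is an assumption of this sketch. Note the two-stage centered reduction: a coefficient equal to 127 is first mapped to $-1$ modulo 128 and only then reduced modulo 3.

```python
def center(x, m):
    """Representative of x mod m in the interval (-m/2, m/2]."""
    r = x % m
    return r - m if r > m // 2 else r


def conv(a, b, q):
    """Cyclic convolution in Z_q[X]/(X^N - 1), coefficients in 0..q-1."""
    n = len(a)
    out = [0] * n
    for i, ai in enumerate(a):
        if ai:
            for j, bj in enumerate(b):
                out[(i + j) % n] = (out[(i + j) % n] + ai * bj) % q
    return out


def dev(a, b, q=128, p=3):
    """Number of positions where (a mod q) mod p and (b mod q) mod p differ."""
    return sum(1 for x, y in zip(a, b)
               if center(center(x, q), p) != center(center(y, q), p))


def verify(m, s, h, f0, g0, q=128, p=3, dmin=55, dmax=87):
    """Accept (m, s) if s is nonzero and both deviation counts lie in [dmin, dmax]."""
    if all(c % q == 0 for c in s):
        return False
    t = conv(s, h, q)
    return (dmin <= dev(s, conv(f0, m, q), q, p) <= dmax and
            dmin <= dev(t, conv(g0, m, q), q, p) <= dmax)
```

Only the public key $h$ and the public polynomials $f_0$, $g_0$ enter the check, which is what the forgery attacks of the next section exploit.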



3 Forgery Attacks 

Paper [9] claims that a signature essentially proves possession of the secret trap- 
door. Further, it envisions several potential attacks and concludes that the se- 
curity of the system, with the above parameters, is comparable to RSA with 
1024 bit moduli. We show that an attacker can generate forgeries (with slightly 
fewer than Dmax = 87 deviations) almost as quickly as the signer can generate 
signatures, without any knowledge of the private key. Furthermore, the attacker 
can generate forgeries with substantially fewer than Dmax deviations by using 
lattice reduction. 
3.1 Basic Forgery Attack: The Principle

In [9] and [8], NSS and NTRU are described as being based on essentially the 
same hard lattice problem. In fact, the problem underlying NSS is more of an 
error correction problem and, as demonstrated in many papers (see e.g. [16]), 
such problems take much larger dimensions to become hard. 

The attack is very simple, once the perspective has been changed, as just 
indicated. The attacker’s task is to find a pair of polynomials $(s, t)$ that satisfy
$t = s * h \pmod{q}$, as well as the deviation requirements:

    $55 \le \mathrm{Dev}(s, f_0 * m) \le 87$;
    $55 \le \mathrm{Dev}(t, g_0 * m) \le 87$.

Since $s$ and $t$ have $2N$ coefficients altogether, and the equation $t = s * h \pmod{q}$
imposes $N$ linear constraints, the attacker has $N$ degrees of freedom remaining
in $s$ and $t$ with which he can try to satisfy the deviation requirements. With
these $N$ degrees of freedom, he sets

    $s_i = (f_0 * m)_i \bmod p$

and

    $t_j = (g_0 * m)_j \bmod p$

for $\lfloor N/2 \rfloor$ coefficients of $s$ and $\lceil N/2 \rceil$ coefficients of $t$; that is, he chooses about
half the coefficients of $s$ and half of $t$ to be non-deviating. The remaining halves
of $s$ and $t$ are left to chance. Since the chosen half of $s$ (resp. $t$) has no deviations,
and the remaining half will probabilistically deviate in about $\frac{2}{3}$ of the positions,
overall about $\frac{1}{3}$ of the coefficients of $s$ (resp. $t$) will deviate. Since $\frac{1}{3} N \approx 84 <
D_{\max}$ for $(N, D_{\max}) = (251, 87)$, this process will usually generate a valid forgery
after only a few iterations. In general, if $p = 3$ and $D_{\max} > N/3$, then this attack
will generate forgeries regardless of the size of $N$.
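The counting argument can be checked with a short simulation of the heuristic model only: about half of the coefficients are fixed to the target value and the rest are drawn uniformly modulo $q$. This sketch deliberately ignores the linear relation between $s$ and $t$ (handled in the next subsection); the function name, the random stand-in for the target polynomial, and the trial count are our own assumptions.

```python
import random


def center(x, m):
    """Representative of x mod m in the interval (-m/2, m/2]."""
    r = x % m
    return r - m if r > m // 2 else r


def simulate_deviation_counts(n=251, q=128, p=3, trials=200, seed=0):
    """Deviation counts when floor(n/2) coefficients match the target exactly
    and the remaining ones are uniform modulo q (heuristic model only)."""
    rng = random.Random(seed)
    fixed = n // 2
    counts = []
    for _ in range(trials):
        # Stand-in for (f0 * m) mod p: arbitrary values in {-1, 0, 1}.
        target = [rng.choice((-1, 0, 1)) for _ in range(n)]
        s = [target[i] if i < fixed else rng.randrange(q) for i in range(n)]
        counts.append(sum(1 for x, y in zip(s, target)
                          if center(center(x, q), p) != y))
    return counts


counts = simulate_deviation_counts()
mean_deviations = sum(counts) / len(counts)
```

Under this model the average lands close to $N/3 \approx 84$, just below $D_{\max} = 87$, in line with the estimate above.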



3.2 Basic Forgery Attack: The Details 

In practice, the attack is slightly more complicated than the above, because it 
is possible that the constraints on s and t are incompatible. In this case, we 
say the attacker is unlucky. To avoid being unlucky, the attacker constrains only 
$k < N/2$ coefficients each of $s$ and $t$. By setting up linear equations based on
the constraints on $t$, we obtain a system of $k$ linear equations modulo $q$ over the
$(N - k)$ free unknowns. The coefficients of the unknowns in this system form a
$k \times (N - k)$ submatrix $M$ of $M_h$ whose coefficients are integers modulo $q$. We
make the heuristic assumption that these coefficients are independent random
bits, when reduced modulo 2.

Lemma 1. Based on the heuristic assumption, the attacker is unlucky with
probability at most $\varepsilon = \frac{1}{2^{N-2k}}$.
Proof of lemma: We show that, with probability at least $1 - \varepsilon$, the columns
of $M \bmod 2$ generate the entire $k$-dimensional space over the two-element field.
If this holds, the system has rank $k$ and it has solutions modulo 2 and modulo
$q$ as well, since $q$ is a power of 2. Now, for every nonzero $k$-bit vector $x$, a column
vector $v$ is such that the inner product $(v, x)$ is zero with probability 1/2. Since
there are $N - k$ independent columns, $x$ is orthogonal to all column vectors
with probability $2^{-(N-k)}$. Since there are at most $2^k$ possible values for $x$, we get that, with
probability at least $1 - 2^k \cdot 2^{-(N-k)} = 1 - 2^{-(N-2k)} = 1 - \varepsilon$, there is no nonzero
vector orthogonal to all column vectors
of $M \bmod 2$. This means that these column vectors span the entire space. □
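The bound of Lemma 1 is easy to evaluate and to sanity-check empirically. The sketch below computes $2^{-(N-2k)}$ exactly and estimates, under the same heuristic of independent random bits, how often a random $k \times (N-k)$ binary matrix fails to have rank $k$. The function names and the trial count are our own; each row is stored as an integer bitmask.

```python
import random
from fractions import Fraction


def unlucky_bound(n, k):
    """Upper bound 2^-(n - 2k) on the probability of being unlucky."""
    return Fraction(1, 2 ** (n - 2 * k))


def gf2_rank(rows):
    """Rank over GF(2) of a matrix given as a list of int bitmasks (one per row)."""
    basis = []
    for r in rows:
        # Each basis vector clears its own leading bit from r when that bit is set.
        for b in basis:
            r = min(r, r ^ b)
        if r:
            basis.append(r)
    return len(basis)


def estimate_unlucky(n, k, trials, seed=0):
    """Fraction of random k x (n - k) binary matrices whose rank is below k."""
    rng = random.Random(seed)
    unlucky = 0
    for _ in range(trials):
        rows = [rng.getrandbits(n - k) for _ in range(k)]
        if gf2_rank(rows) < k:
            unlucky += 1
    return unlucky / trials
```

For example, `unlucky_bound(251, 121)` evaluates to the fraction 1/512, i.e. $2^{-9}$.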

Setting $k = 121$, the attacker will be lucky with probability at least $1 - 2^{-9}$.
Assuming he is lucky, the attack now amounts to solving a system of 121 equations
with 130 unknowns. However, a closer look shows that the matrix corresponding
to this system does not depend on $m$, provided the attacker keeps the
same selection of coordinates for his constraints; only the “right-hand side” of
the linear system does. This makes possible standard preprocessing of the linear
system. To keep things simple, assume that, by suitably reindexing coefficients,
one has brought the constrained coefficients of $s$ in front and made the
constrained coefficients of $t$ the trailing block. Then, the matrix $M$ of the system
that the attacker has to solve is at the bottom right corner of $M_h$, defined by
the last $k$ rows of $M_h$ and its last $N - k$ columns. Further relabeling makes the
last $k$ columns of $M$ an invertible submatrix $U$. Again $U$ lies at the bottom right
corner of $M_h$. Thus, once the attacker precomputes the inverse $U^{-1}$ of $U$, he may
thereafter generate solutions to the linear system by choosing the $N - 2k = 9$
middle coordinates of $s$ arbitrarily and obtaining the $k$ last ones by a single
multiplication by $U^{-1}$. For $k = 121$, one readily checks that the obtained solution
will satisfy the deviation requirements with probability $> 1/4$, so the attacker
can expect to obtain the desired forgery after only 4 such multiplications. This
makes forgery almost as fast as regular signature generation.
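The procedure can be sketched end to end in Python. This is not the C program used for the experiments; it is a simplified illustration under several assumptions: a uniformly random $h$ stands in for a public key, the first $k$ coefficients of $s$ and the last $k$ of $t$ are the constrained ones, the system is reduced by Gauss-Jordan elimination with odd pivots for each message instead of precomputing $U^{-1}$, and the deviation bounds are treated as inclusive. All function names are ours.

```python
import random


def center(x, m):
    r = x % m
    return r - m if r > m // 2 else r


def conv(a, b, q):
    n = len(a)
    out = [0] * n
    for i, ai in enumerate(a):
        if ai:
            for j, bj in enumerate(b):
                out[(i + j) % n] = (out[(i + j) % n] + ai * bj) % q
    return out


def dev(a, b, q, p):
    return sum(center(center(x, q), p) != center(center(y, q), p)
               for x, y in zip(a, b))


def reduce_system(A, b, q):
    """Gauss-Jordan elimination modulo q (a power of 2), pivoting on odd entries.
    Returns (M, pivot_cols), or None if A mod 2 lacks full row rank ("unlucky")."""
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    rows, cols = len(A), len(A[0])
    pivot_cols, r = [], 0
    for c in range(cols):
        if r == rows:
            break
        pr = next((i for i in range(r, rows) if M[i][c] % 2), None)
        if pr is None:
            continue
        M[r], M[pr] = M[pr], M[r]
        inv = pow(M[r][c], -1, q)            # odd entries are units modulo 2^e
        M[r] = [(v * inv) % q for v in M[r]]
        for i in range(rows):
            if i != r and M[i][c]:
                fac = M[i][c]
                M[i] = [(vi - fac * vr) % q for vi, vr in zip(M[i], M[r])]
        pivot_cols.append(c)
        r += 1
    return (M, pivot_cols) if r == rows else None


def forge(h, target_s, target_t, k, q=128, p=3, dmin=55, dmax=87, tries=200, seed=0):
    """Fix s on its first k positions and t on its last k, then randomise the rest."""
    rng = random.Random(seed)
    n = len(h)
    # Unknowns s_k..s_{n-1}; one equation t_j = target_t[j] per j among the last k.
    A = [[h[(j - i) % n] for i in range(k, n)] for j in range(n - k, n)]
    b = [(target_t[j] - sum(target_s[i] * h[(j - i) % n] for i in range(k))) % q
         for j in range(n - k, n)]
    reduced = reduce_system(A, b, q)
    if reduced is None:
        return None
    M, pivot_cols = reduced
    free_cols = [c for c in range(n - k) if c not in pivot_cols]
    for _ in range(tries):
        x = [0] * (n - k)
        for c in free_cols:
            x[c] = rng.randrange(q)          # the n - 2k free coordinates
        for row, c in zip(M, pivot_cols):
            x[c] = (row[-1] - sum(row[fc] * x[fc] for fc in free_cols)) % q
        s = [v % q for v in target_s[:k]] + x
        t = conv(s, h, q)
        if (dmin <= dev(s, target_s, q, p) <= dmax
                and dmin <= dev(t, target_t, q, p) <= dmax):
            return s, t
    return None


def demo_forgery(seed, n=251, q=128, p=3, k=121, dm=32):
    """Forge a signature on a random message representative under a random h."""
    rng = random.Random(seed)
    h = [rng.randrange(q) for _ in range(n)]     # stand-in public key (assumption)
    m = [0] * n
    support = rng.sample(range(n), 2 * dm)
    for pos in support[:dm]:
        m[pos] = 1
    for pos in support[dm:]:
        m[pos] = -1
    f0 = [1] + [0] * (n - 1)                     # f0 = 1
    g0 = [1, -2] + [0] * (n - 2)                 # g0 = 1 - 2X
    target_s = [center(center(c, q), p) for c in conv(f0, m, q)]
    target_t = [center(center(c, q), p) for c in conv(g0, m, q)]
    forged = forge(h, target_s, target_t, k, q, p, seed=seed)
    if forged is None:
        return None
    s, t = forged
    return s, t, h, target_s, target_t
```

`demo_forgery(seed)` returns `None` in the rare unlucky case; otherwise it returns a pair $(s, t)$ with $t = s * h \bmod q$ whose deviation counts both fall in the accepted range, without any private key being involved.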

Alternatively, the attacker may search for a solution whose number of deviations
lies closer to the middle of the interval $(D_{\min}, D_{\max})$, simply by searching
through the $128^9$ solutions to his linear equations. In a relatively short time, he
can expect to find a solution $(s, t)$ for which $s$ and $t$ have, for example, only 75
deviations.

A computer program written in C confirms the above analysis. Specifically, 
we have carried out the following two experiments: 

1. The first with a public key that we manufactured, corresponding to the 
parameters from [9]. 

2. The second with a public key coming from one challenge from the NTRU 
web site. This challenge is for the encryption scheme. Unfortunately, there is 
no challenge for the signature scheme, but we wished to make it clear that we 
were working without the secret key. The challenge uses N = 263 instead of 
N = 251. We left the other parameters unchanged and observe that raising 
N only makes the forgery slightly more difficult. 

We found forgeries whose distance pairs are respectively (75,74) and (79,79),
close to the middle of the interval $(D_{\min}, D_{\max})$.
3.3 Forgery Attack with Lattice Reduction

In this section we make use of lattice reduction, a technique to find useful $\mathbb{Z}$-bases
of lattices (discrete subgroups of $\mathbb{R}^n$). The celebrated LLL algorithm [13] is one
of a family of algorithms that find bases containing short vectors in a lattice,
and has found many uses in cryptology. The contemporary survey [15] provides
an overview of lattice techniques and [1] provides detailed descriptions of many
forms of the LLL reduction algorithm. In this paper, we use LLL as a black box
algorithm to find a vector of short Euclidean norm in a lattice defined by the
$\mathbb{Z}$-span of the rows of a matrix.

We can strengthen the basic forgery attack described above by supplementing
it with a lattice reduction technique. We exploit the fact that we have considerable
freedom when choosing the constrained coefficients of $s$ and $t$ and make the
observation that all possible ‘simple forgeries’ differ from a given one by a
$2N$-dimensional vector from an easily defined lattice. In other words, the idea here
is (1) to generate an initial $(s'', t'')$ using the basic forgery attack, and then (2)
to correct some of the initial signature’s deviations using lattice reduction. This
hybrid approach allows us to generate forgeries averaging about 56 deviations in
a few minutes.

Let $(s'', t'')$ be the initial signature obtained using the basic forgery attack.
Since $t'' = s'' * h \pmod{q}$, the vector $(s'', t'')$ is in the lattice generated by the
rows of the following matrix^1

    $L_{CS} = \begin{pmatrix} I_{(N)} & M_h \\ 0 & q I_{(N)} \end{pmatrix}$

where $I_{(N)}$ denotes the $N$-dimensional identity matrix. In the basic forgery
attack, to describe it in a slightly different fashion than previously, we found an
invertible $k \times k$ submatrix $U$ of $M_h$ and then reordered the rows and columns
of $L_{CS}$ to obtain

    $L_{CS,2} = \begin{pmatrix}
        I_{(N-k)} & 0 & R & S \\
        0 & I_{(k)} & T & U \\
        0 & 0 & q I_{(N-k)} & 0 \\
        0 & 0 & 0 & q I_{(k)}
    \end{pmatrix}$

where the invertibility of $U$ made it easy to set the first $k$ (actually, first $N - k$)
and last $k$ columns to whatever values we desired, modulo $q$. So, without loss of
generality, we assume that in our initial signature $(s'', t'')$, the first $k$ coefficients
of $s''$ and last $k$ coefficients of $t''$ are chosen to be non-deviating (understanding
that since the rows and columns of $L_{CS}$ were reordered, $s''$ and $t''$ have been
relabeled).

The attacker now would like to find some way of correcting the $N - k$ deviating
coefficients of $s''$ (resp. $t''$) without touching the $k$ non-deviating coefficients of
$s''$ (resp. $t''$). To this end, the attacker would like to find a set of harmless row
vectors in the lattice generated by $L_{CS,2}$ that contain zeros in the first $k$ and last
$k$ positions, so that, for any vector $(v_s, v_t)$ in this set, the pair $(s'' + v_s, t'' + v_t)$
will still be non-deviating in its first $k$ and last $k$ coefficients, while possibly
having fewer deviations in its other positions.

^1 Coppersmith and Shamir introduced this lattice in their attack on NTRU [4]. Since
that time, the inventors of NTRU have hypothesized that the security of NTRU and
NSS is related to the apparently hard problem of finding short vectors in this lattice.

We obtain the set of harmless vectors by making a slight modification to
$L_{CS,2}$, obtaining a different lattice basis for the same lattice:

    $L_{CS,3} = \begin{pmatrix}
        I_{(N-k)} & -V & R - VT & 0 \\
        0 & I_{(k)} & T & U \\
        0 & q I_{(k)} & 0 & 0 \\
        0 & 0 & q I_{(N-k)} & 0 \\
        0 & 0 & 0 & q I_{(k)}
    \end{pmatrix}$,

where $V = S U^{-1} \pmod{q}$. To check that both generated lattices are indeed the
same, one simply considers a linear combination of the first $N$ rows of $L_{CS,2}$,
corresponding to the sequence of coefficients $(\alpha_1, \dots, \alpha_N)$. Writing the coefficients
blockwise as $(\Lambda_1, \Lambda_2)$, we see that exactly the same vector modulo $q$ is obtained
from the rows of $L_{CS,3}$ by a linear combination corresponding to $(\Lambda_1, \Lambda_2 + \Lambda_1 V)$.
The result follows. Notice that the rows $k + 1$ to $N - k$ and rows $N + 1$ to $2N$
of $L_{CS,3}$ have no nonzero coefficients in the first $k$ or last $k$ positions. We let
$L_{harmless}$ be the lattice generated by these $(2N - 2k)$ harmless vectors. These
vectors are clearly linearly independent, so we conclude that the dimension of
$L_{harmless}$ is exactly $(2N - 2k)$.

Now, how do we use the lattice of harmless vectors to improve upon $(s'', t'')$?
We will construct a lattice in which short vectors correspond to vectors with
small deviations. Then we can search for a harmless vector which, when added
to $(s'', t'')$, yields a very short vector. This problem is an example of a closest vector
lattice problem (CVP), related to the shortest vector lattice problem (SVP). See
[15] for some comments on the relationship of the CVP to the SVP. To this end,
we consider the lattice

    $L_{pq} = \begin{pmatrix} p L_{harmless} \\ (s', t') \end{pmatrix}$,

where $(s', t')$ is the row vector with coefficients modulo $pq$ satisfying $s' = s'' \bmod q$
and $t' = t'' \bmod q$, as well as $s' = (f_0 * m) \bmod p$ and $t' = (g_0 * m) \bmod p$
(again, and hereafter, keeping the relabeling in mind). For any row vector $(v_s, v_t)$
in this lattice, $v_s * h = v_t \pmod{q}$. Moreover, $v_s$ and $v_t$ will satisfy one of three
equations modulo $p$, depending on the value of the scalar coefficient of $(s', t')$:

    $v_s = v_t = 0 \bmod p$, or
    $v_s = (f_0 * m) \bmod p$ and $v_t = (g_0 * m) \bmod p$, or
    $-v_s = (f_0 * m) \bmod p$ and $-v_t = (g_0 * m) \bmod p$.

If we could find a $(v_s, v_t)$ with small coefficients (for example, in the range
$(-q/2, q/2]$) that does not satisfy the first condition^2 $v_s = v_t = 0 \bmod p$, then
either $(v_s, v_t)$ or $(-v_s, -v_t)$ would be a valid forgery having zero deviations.
Unfortunately, finding a short $(v_s, v_t)$ appears to be a hard lattice problem that
cannot be solved in any reasonable time for lattices as large as $L_{pq}$.

^2 We observe in practice that this first condition may be avoided empirically with high
probability via a small modification of the lattice $L_{pq}$.

So, instead of attempting to reduce $L_{pq}$, we select $c$ columns of $L_{pq}$,
corresponding to unchosen coefficients of $(s'', t'')$, and define $L_{final}$ to be the
submatrix of $L_{pq}$ consisting of these $c$ columns. The lattice generated by $L_{final}$
is only $c$-dimensional. We then apply lattice reduction to $L_{final}$, obtaining a
$c$-dimensional output vector. Every coefficient of the output vector that falls in
the interval $(-q/2, q/2]$ is now non-deviating. In general, the expected number
of deviations for $s$ (resp. $t$) after this process is $(2N - 2k - c)/3 + \Delta/2$, where
$\Delta$ is the expected number of coefficients of the $c$-dimensional output vector that
are outside the interval $(-q/2, q/2]$.

For concreteness, when attacking the “practical implementation of NSS,” the
attacker might set $k$ to be 95 and $c$ to be 150 and reduce the resulting lattice using
a blocksize of 20. The lattice reduction algorithm completes in a few minutes,
and empirically, the resulting $s$ and $t$ typically each deviate in about 56 positions.
For NSS to be secure, $D_{\max}$ would, of course, have to be set much lower than 56
to ensure that the hybrid forgery attack fails with high probability.
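As a worked check of the deviation estimate with these parameters ($N = 251$, $k = 95$, $c = 150$), the first term of the estimate evaluates as follows. The interpretation of the remainder is our own reading, not a figure stated by the experiments.

```latex
\[
\frac{2N - 2k - c}{3}
  = \frac{2 \cdot 251 - 2 \cdot 95 - 150}{3}
  = \frac{162}{3}
  = 54 .
\]
```

If the estimate is taken at face value, the observed figure of about 56 deviations leaves roughly 2 for the second term, i.e. on the order of four coefficients of the reduced output vector falling outside the interval $(-q/2, q/2]$.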

4 Transcript Attacks 

4.1 Description of the Attack 

In this section we show how to recover the private keys $f$ and $g$ by examining
a transcript of signatures. A transcript consists of some number of pairs $(m, s)$
of messages with valid signatures created by the NSS signature algorithm. We
also obtain $t$ for each message via the relation $t = s * h \pmod{q}$. The basis of
the attack is to examine the distributions of the $s$ or $t$ coefficients for a subset of
messages $m$. By setting one coefficient of $m$ to a fixed value, the distributions of
the coefficients of $s$ and $t$ converge to a limiting distribution which depends on a
chosen coefficient of the secret key $f$ or $g$. Thus we compare sample distributions
of $s$ or $t$ to precomputed estimations of the limiting distribution for each possible
value of the coefficient of $f$ or $g$.

As mentioned above, both the NTRU corporation research team and Mironov 
observed that if the averages of these distributions were dependent on the key 
coefficients, the private keys would be extremely rapidly recovered by essentially 
averaging the signatures. This problem was quickly corrected in the following 
version of NSS [8], by altering the signature algorithm to guarantee that the 
average of these distributions would be indeed independent of the private key 
coefficients. However, certain $s$ and $t$ distributions do depend on the possible $f$
and $g$ coefficient values and are still quite distinct from one another. Comparing
these distributions to one another or to a precomputed distribution leads to 
an exposure of the private key. One interpretation of the attack is that it is an 
exploitation of information leaked through the higher moments of the signatures. 

The signature $s$ of a message $m$ is obtained via an algorithm which chooses $w_1$
and $w_2$ according to an intricate algorithm (see [8]), and sets

    $s = f * (m + w_1 + p w_2)$.

This algorithm to choose $w_1$ and $w_2$ is described in Appendix B and it is easily
observed to be constructed so as to avoid the simple averaging attack.

All of our experiments have used the suggested parameters $q = 128$, $p = 3$,
and $N = 251$, although the technique is generally applicable. For this parameter
set, the polynomials $w_1$ and $w_2$ have approximately 25 and 64 nonzero entries
each, and $m$ is set to have 32 coefficients equal to 1 and 32 equal to $-1$. The
coefficients of $s$ thus depend on the private key $f$, the message $m$ and the
randomly generated polynomials $w_1$ and $w_2$. The situation is entirely similar with
$g$ and $t$ since

    $t = g * (m + w_1 + p w_2)$.

In order to obtain the coefficient $f_k$, we fix indices $i_0$ and $j_0$ with $i_0 =
j_0 + k \bmod N$, and examine the distribution of $s_{i_0}$ over a transcript of messages
with $m_{j_0} = 1$. Unraveling the convolution arithmetic, we have

    $s_{i_0} = \sum_{j + k = i_0} f_k (m_j + w_{1,j} + p w_{2,j})$.

We note that the quantity $W_j = m_j + w_{1,j} + p w_{2,j}$ is nearly (but not exactly, due
to a quirk of the $w_1$ generation) identically distributed for each index $j$, when
the distribution is taken over random values of $m$. We consider $s_{i_0}$ to be the sum
of the random variables $W_j$, and because $f$ has exactly 140 nonzero entries, $s_{i_0}$ is
nearly a sum of 140 identically distributed random variables drawn from a fixed
distribution. However, requiring that $m_{j_0} = 1$ (or 0 or $-1$) distinguishes the
random variable $W_{j_0}$ from the others. Our observation is that the term $f_k W_{j_0}$
in the sum defining $s_{i_0}$ will contribute differently depending on the value of $f_k$.

Since an explicit calculation of the distribution of $s_i$ would necessarily rely
on the complex formulas for $w_1$ and $w_2$, we tested the heuristic reasoning above
with several numerical experiments. There are many possible variants of this
approach. For example, one could also set $m_j = 0$ or $m_j = -1$ for the
appropriate coefficient, and thereby extract additional information from a given size
transcript. We mention here only one key optimization. Although we fixed the
index $i_0$ above, in fact every coefficient of $m$ may be potentially used to obtain
information about each coefficient of $f$. Namely, for a single message-signature
pair, examining $s_i$ for all indices $i$ such that $m_j = 1$ and $i = j + k$ speeds up
the convergence by a factor of 32, since $m$ has 32 coefficients equal to 1. Thus
we essentially examine the distribution

    $s_{ij} = \sum_{j + k = i,\ m_j = 1} f_k (m_j + w_{1,j} + p w_{2,j})$

over a large set of transcripts. We performed several computer experiments which 
implemented the above optimized statistical analysis. Our programs, written in 
C, were able to recover the private key with a very high degree of accuracy. 
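The bookkeeping behind this optimization can be sketched as follows: for each message-signature pair, every index $j$ with $m_j = 1$ contributes one sample $s_{(j+k) \bmod N}$ to the statistics for offset $k$. This is only the sample-collection step, with function names of our own choosing; it does not reproduce the generation of $w_1$ and $w_2$ from Appendix B, so it operates on whatever transcript is supplied.

```python
from collections import Counter, defaultdict


def collect_samples(m, s, samples=None):
    """For every j with m[j] == 1, file s[(j + k) mod N] under offset k."""
    n = len(m)
    if samples is None:
        samples = defaultdict(list)
    for j, mj in enumerate(m):
        if mj == 1:
            for k in range(n):
                samples[k].append(s[(j + k) % n])
    return samples


def empirical_distribution(values):
    """Relative frequency of each observed value."""
    counts = Counter(values)
    total = len(values)
    return {x: c / total for x, c in counts.items()}
```

Calling `collect_samples` repeatedly with the same `samples` dictionary accumulates a transcript; `empirical_distribution(samples[k])` then gives the sample distribution associated with the coefficient $f_k$.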

4.2 Efficiency of the Attack 

To create the estimated background limiting distribution, we simply created
several million messages, each signed by a different private key, and calculated
the distributions of $s_i$ conditional on $m_j = 1$ and on $f_k$ assuming a
particular value in the set $\{-3, 0, 3\}$. These statistics were gathered
individually for each coefficient $f_k$, but for simplicity of exposition we
combine them, and define the three probability distributions $F_0$, $F_3$,
$F_{-3}$ to be the limiting distributions of $s_i$ given $m_j = 1$ and the
prescribed $f_k$ value.

Given a valid transcript of signed messages, for each coefficient index $i$, the
sample distribution of $s_i$ is formed, and denoted $S_i$. Next, $S_i$ is
compared to each of the distributions $F_0$, $F_3$, $F_{-3}$, according to some
distribution comparison method. To do this, we define $S_i(x)$ to be the
probability that $s_i = x$ for some $x \bmod q$. Similarly, define
$F_{0,i}(x)$, $F_{3,i}(x)$, and $F_{-3,i}(x)$ to be the respective probabilities
that $s_i = x$ (conditional on the prescribed value of $f_k$ and $m_j = 1$ for
$i = j + k$). One simple, effective measure useful for distinguishing these
distributions is defined as

$$\Delta_i(v) = \sum_{x} \bigl(F_{v,i}(x) - A_i(x)\bigr)\bigl(S_i(x) - A_i(x)\bigr),$$

where $A_i(x)$ is the average of the frequencies $F_{0,i}(x)$, $F_{3,i}(x)$,
and $F_{-3,i}(x)$. Thus for each coefficient $i$ of $f$, we calculate
$\Delta_i(v)$ for $v \in \{-3, 0, 3\}$. Next, we order the values
$\Delta_i(3)$ and $\Delta_i(-3)$ and select the smallest 70 values of each to
identify the coefficients with $f = 3$ and $f = -3$, respectively.

There are clearly many other ways in which the distributions could be compared,
for example with the $L^2$ norm. The convergence obtained with the above
metric efficiently recovered the key coefficients, and alternative measures were
only used in subsequent confirming experiments. We briefly note that the first
coefficient of $f$ has a slightly different distribution from the other indices,
but this may be easily adjusted for, and is of minimal importance as it is just
a single index.

After predicting the private key, we compared it to the actual private key, 
and checked our results. Here we summarize the number of mistakes made for 
several applications of this technique to transcripts of different lengths. 



Signatures    Trials    Average Errors
100,000       31        7.3
300,000       16        2.6
400,000       5         1.2



The incorrectly predicted coefficients all correspond to indices which were
near the end of the 70 minimal values in the orderings of $\Delta_i(3)$ and
$\Delta_i(-3)$. In fact, in each trial, we identified a subset of 40 such
‘dubious’ indices before comparing to the private key, and verified that all of
the errors were located at such indices. Given this localization of the errors,
we conclude that it is feasible via direct search to obtain the exact private
key given our estimated private key.

Depending upon the size of the index subset to examine, we estimate that
it is possible to obtain the exact key via direct search, even if the guess has
up to 10 errors.¹ Thus with our method of examining the $s$ distribution, the
key $f$ may be completely deduced with as few as 100,000 signatures. We note
also that significant partial information about a key’s values may be used to
greatly speed up certain lattice attacks, and in particular lattice reduction
techniques may also be used to correct the estimated keys with a larger error
tolerance than the brute force search method described above. These
optimization techniques are not described further in this paper.

¹ Assuming the 10 errors are so localized, an upper bound on the number of
potential corrections to $f$ is equal to the binomial coefficient
$\binom{40}{10}$, which is less than $2^{30}$.
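The size of the direct search mentioned in the footnote can be checked with a few lines of Python. This is only an arithmetic sketch: it assumes, as the footnote does, that all 10 errors fall inside the 40 dubious indices and that the bound counts the ways of choosing which 10 indices to correct. The variable names are illustrative.

```python
import math

dubious_indices = 40   # size of the 'dubious' index subset reported in the text
assumed_errors = 10    # number of wrong coefficients assumed in the estimate

# Number of ways to choose which 10 of the 40 dubious indices are wrong.
candidates = math.comb(dubious_indices, assumed_errors)

print(candidates)             # 847660528
print(candidates < 2 ** 30)   # True
```

A search space of this size is well within reach of an exhaustive search, which is consistent with the feasibility claim above.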

We note that it is likely that examining t rather than s would yield improved 
convergence rates. This conjecture is based on the fact that g is defined to have 
80 nonzero entries rather than 140. We did not test this hypothesis directly in 
the above situation, but rather in the subsequent statistical attack on an NSS 
variant which we now describe. 



4.3 An NSS Variant 

Although the NSS version published in [8] was the subject of our first analysis,
several variants proposed for the recent EESS standard [2] use a different
private key structure. These key structures were proposed to increase the
signing efficiency. Recall that the key space notation $\mathcal{L}(d, d)$
indicates a polynomial with $d$ coefficients equal to 1 and $d$ coefficients
equal to $-1$. In the original version $f$ was chosen to be $f = 1 + 3f_1$
where $f_1 \in \mathcal{L}(70, 70)$, and $g = 1 - 2x + 3g_1$ where
$g_1 \in \mathcal{L}(40, 40)$.

The optimized key space is formed as follows: $f = 1 + 3 f_1 * f_2$ and
$g = 1 + 2x + 3 g_1 * g_2$, where $f_1 \in \mathcal{L}(7, 7)$,
$f_2 \in \mathcal{L}(5, 5)$, $g_1 \in \mathcal{L}(5, 5)$, and
$g_2 \in \mathcal{L}(4, 4)$.

Because of cancellation or correlation in the product, $f$ and $g$ typically
contain fewer nonzero elements and contain several coefficients equal to 6 or
$-6$. Thus while the original scheme has private keys with a known number of
coefficients that assume values in the set $\{3, 0, -3\}$, the new keys have
differing numbers of coefficients which typically assume values in the set
$\{6, 3, 0, -3, -6\}$. (We ignore the first few indices of $f$ and $g$ for
simplicity.)
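To see why the product-form keys have a variable number of nonzero coefficients, one can sample such a key and tabulate its coefficient values. The sketch below is illustrative only: the helper names (`sample_L`, `cyclic_convolve`, `optimized_f`) are hypothetical, the product is assumed to be taken in the ring of polynomials modulo $x^N - 1$ with $N = 251$, and the sampling is uniform, which may differ from the actual EESS key generation.

```python
import random

N = 251  # ring dimension of the NSS251 parameter set quoted in the paper

def sample_L(d, n=N, rng=random):
    """Random polynomial with d coefficients +1, d coefficients -1, rest 0."""
    poly = [0] * n
    positions = rng.sample(range(n), 2 * d)
    for pos in positions[:d]:
        poly[pos] = 1
    for pos in positions[d:]:
        poly[pos] = -1
    return poly

def cyclic_convolve(a, b):
    """Product of a and b modulo x^n - 1 (indices wrap around)."""
    n = len(a)
    out = [0] * n
    for i, ai in enumerate(a):
        if ai:
            for j, bj in enumerate(b):
                if bj:
                    out[(i + j) % n] += ai * bj
    return out

def optimized_f(rng=random):
    """Sample f = 1 + 3 * f1 * f2 with f1 in L(7,7) and f2 in L(5,5)."""
    f1 = sample_L(7, rng=rng)
    f2 = sample_L(5, rng=rng)
    f = [3 * c for c in cyclic_convolve(f1, f2)]
    f[0] += 1
    return f

f = optimized_f(random.Random(0))
histogram = {}
for c in f[1:]:                      # ignore index 0, which carries the "+1"
    histogram[c] = histogram.get(c, 0) + 1
print(sorted(histogram.items()))     # counts of each coefficient value
```

Running this with different seeds shows the counts of $\pm 3$ and $\pm 6$ coefficients changing from key to key, which is the situation the text describes.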

At first glance this appears to make the creation of the precomputed limiting
distributions difficult. However, there are actually very few possible cases to
consider. For example, a typical $g$ has 62 coefficients equal to 3 or $-3$ and
5 equal to 6 or $-6$. The various other possibilities may be tried sequentially,
in order of probability. Alternatively, we note that it is also true that the
limiting distributions of $s$ and $t$ distinguish between the key structures
with fewer or greater numbers of 6 and $-6$ coefficients very rapidly, without
a need to fix values of $m_j$.

We found that the new private key structures led to even faster convergence.
Several factors were changed simultaneously in the following experiment. First,
we analyzed the distribution of $t$ instead of that of $s$. Secondly, we assumed
the number of coefficients in $\{6, -6\}$ and in $\{3, -3\}$ was known, and did
not attempt to deduce it. Thirdly, we used the $L^2$ norm to compare the
distributions. Finally, a two-stage algorithm first found the 6 and $-6$
coefficients (very easily), and the remaining indices were ordered by the $L^2$
distances to the precomputed distributions. The values of $f_k$ were predicted
according to this order. We found few errors in these predictions, with a
smaller number of signatures.
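The $L^2$ comparison step can be illustrated with a small sketch. It is not the authors' program: the function names are hypothetical, the reference distributions are toy values over a modulus of 4 rather than real precomputed limiting distributions modulo $q$, and only the ranking by distance is shown.

```python
import math
from collections import Counter

def empirical_distribution(samples, q):
    """Frequencies of each residue 0..q-1 among the observed coefficients."""
    counts = Counter(x % q for x in samples)
    total = len(samples)
    return [counts.get(x, 0) / total for x in range(q)]

def l2_distance(dist_a, dist_b):
    """Euclidean distance between two distributions given as lists."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(dist_a, dist_b)))

def rank_by_l2(sample_dist, references):
    """Reference labels sorted from the closest distribution to the farthest."""
    return sorted(references,
                  key=lambda label: l2_distance(sample_dist, references[label]))

# Toy reference distributions, labelled by the hypothesised key coefficient.
references = {
    0: [0.25, 0.25, 0.25, 0.25],
    3: [0.4, 0.3, 0.2, 0.1],
    -3: [0.1, 0.2, 0.3, 0.4],
}
observed = empirical_distribution([0, 0, 0, 0, 1, 1, 1, 2, 2, 3], q=4)
print(rank_by_l2(observed, references))   # [3, 0, -3]
```

In the attack described above, the same ranking would be applied per index, with the closest reference distribution giving the predicted coefficient value.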




Cryptanalysis of the NTRU Signature Scheme (NSS) from Eurocrypt 2001 






Signatures    Trials    Average Errors
30,000        10        5.6
50,000        10        4.8
100,000       5         1.8
200,000       5         1.0



As with the standard keys, it is possible to identify a subset of questionable 
indices for which the guess may be in error. Therefore even a direct search is 
feasible to obtain the exact private key. Thus we conclude that this last technique 
would find the exact private key with a transcript of size 30,000. 

Further optimizations are possible. For example, for a hybrid attack one may
estimate both keys $f$ and $g$ via a method described above, and then assign
confidence measures to each index. We then assume that the $N/2$ coefficients
of $f$ and $N/2$ coefficients of $g$ that have the highest confidence measures
are in fact correctly chosen. The remaining coefficients are determined by the
relation $g = f * h$ as in Section 3, and finally we check that the deduced key
pair $(f, g)$ is correct. Only enough signatures to determine half of each of
$f$ and $g$ would then be needed to obtain the exact key. Another promising
optimization would be to
use the values of the message coefficients $m_j$ to make an educated guess at
the values of the signature coefficients before they were reduced modulo $q$,
and compare these distributions.
Refinements of this strategy might reduce the number of signatures to ten or 
twenty thousand. However, in light of our efficient forgery and the fact that the 
NSS scheme has recently been replaced with a revised version, such optimizations 
are not pursued further in this paper. 

5 Countermeasures 

Subsequent to the discovery of these attacks, the authors of NSS began searching 
for a secure revision of the NSS signature scheme. Jeffrey Hoffstein outlined 
several techniques to alter the scheme at Eurocrypt 2001. These modifications 
were formalized shortly thereafter in a technical note on the NTRU web site [11], 
with further improvements in the second draft [12]. 

Shortly after this paper was initially submitted, the authors of NSS settled on
a revision of NSS, complete with suggested parameter choices. The precise
definition of the revised scheme may be found in a preliminary standards
document [2]. Currently, the third draft of this standard is available at the
Consortium for Efficient Embedded Security web page [3].

The revised scheme does indeed appear to resist the attacks described in this 
paper. We do not rigorously define the new scheme here, but only mention the 
revised scheme’s salient features and how they obviate the above attacks. Further 
details may currently be found in technical notes, a preprint, and a standards 
document [11,12,10,3]. 

The following is a partial list of the modifications. 

1. Private Key Generation: In the version of NSS attacked in this paper,
$f = f_0 + p f_1$ and $g = g_0 + p g_1$ where $f_0$ and $g_0$ are public
parameters. In the revised scheme, $f = u + p f_1$ and $g = u + p g_1$ where
$u$ is kept private.

2. Verification Criteria: Verification is no longer based on the single
criterion of deviations, but on multiple tests.

- Norm Conditions: Verify that $|p^{-1}(s - m) \bmod q| < B$ and
$|p^{-1}(t - m) \bmod q| < B$, where $B$ is some bound on the centered norms
[11].

- Coefficient Distribution Checks: Perform a battery of specific checks (in
[3]) on the distributions of the coefficients of $s$ and $t$.

- Moment Balancing: Optionally, use an alternate method of $w_1$ and $w_2$
creation, which alters the coefficients to include higher moment balancing.

These alterations were made to avoid the attacks presented in this paper, and
therefore seem rather ad hoc. In particular, the verification protocol is
strikingly lengthy [3], consisting of 17 steps! However, the new key component
$u$, norm conditions, and distributional criteria do appear to improve the
security.

First, we discuss the new key component $u$. This is a very clever method of
masking the combination of $m$ coefficients which determine the distribution of
$w_0$ values. Without the tool of controlling this distribution via selecting
subsets of the messages (say with $m_j = 1$), our transcript analysis cannot
directly obtain distributions which are sensitive to the private key coefficient
values. Adding $u$ appears to make the distributions very close, even given
millions of signatures. This renders the key recovery attack much less
effective. Alternatively, the moment balancing techniques may also be used to
make the distributions very close to one another.

Although the new verification protocol is a much less elegant revision than
the use of $u$, it appears to serve its purpose of making forgery more
difficult. The norm conditions relate the forgery problem of revised NSS to a
(presumably hard) closest vector problem; the deviations criterion did not
accomplish this. Also, the distribution checks appear to screen out forgeries
generated by the forgery attacks above. However, it is unclear whether these
new verification criteria are sufficient. It is likely that an attacker could
already satisfy the norm conditions by simply using our (unmodified) forgery
attack with the lattice reduction. Further cryptanalysis may show that it is
possible to refine our attack to satisfy the distribution checks as well.

The authors of NSS give some interesting analysis on how well the new scheme 
resists the attacks presented here [10]. They include a description of the new 
verification checks, a careful distributional analysis of the coefficients of the 
signatures in the new scheme, and a heuristic argument that signature forgery is 
as hard as a closest vector problem, assuming the adversary is given no transcript 
of previous signatures. 

The new scheme is expected to receive renewed scrutiny, and since the key 
generation, signing and verification processes differ substantially, both forgery 
and key recovery techniques should be re-evaluated. 

6 Conclusion 

We wish to mention that our attack does not endanger the NTRU encryption
scheme. On the other hand, we think that it shows the benefits of the provable
security approach taken by cryptographic research in the last few years.
NSS had no security proof at all, not even relative to a precisely described
lattice problem of some form. Lacking such proof, one could not easily argue
that NSS was immune to potential simple attacks, as demonstrated by the present
work. Following the attack, NTRU researchers have investigated enhanced
encoding/verification methods in [11]. It appears that such methods can offer a
form of provable security by reducing forgery to solving a well-defined lattice
attack. This rules out the method of Section 3. However, such a reduction would
not apply to an attacker who takes advantage of transcripts of previously
obtained signatures, as in Section 4. We believe that the heuristic approach
taken by NSS designers makes it extremely difficult to prevent such transcript
attacks.

A An Example of Signature Forgery

Here we give an example of how to forge signatures using the public key. Let
parameters be as defined in NSS251-3-SHA1-1 [2]: $N = 251$, $p = 3$, $q = 128$,
Vm = 32, $\mathrm{Dev}_{\min} = 55$, $\mathrm{Dev}_{\max} = 87$, $f_0 = 1$,
$g_0 = 1 - 2X$. Let the public key be

    [...]

Let the message to be signed be

         0123456789abcdef0123456789abcdef
    0:   000000001000-0010000000-110000--
    2:   0000000-000101000000-000010--00-
    4:   000---00-000010000--101000001000
    6:   010000001010001-10000-000100-000
    8:   0010-00-0000000001000000000000-0
    a:   001--000000---000100010000110-00
    c:   00000-000000001101-0-000-0010000
    e:   0100000000000001-0001000001

(- denotes the integer $-1$). We now find an initial signature $(s'', t'')$ by
imposing $k = 95$ constraints on both $s$ and $t$. For clarity in this example,
we impose these constraints on the first 95 coefficients of $s''$ and the last
95 coefficients of $t''$. Then, from the many possible $(s'', t'')$, we may get
$s''$ equal to

    [...]

and $t''$ equal to

    [...]

The pattern of deviations between $s''$ and $(f_0 * m)$ looks as follows (each
star denotes a deviation):

    [...]

For $t''$ and $(g_0 * m)$ the pattern is:

    [...]



At this point $s''$ and $t''$ have 108 and 98 deviations, respectively. We now
apply lattice reduction to coefficient positions 95 through 169 in $s''$ and 81
through 155 in $t''$ (the 75 leftmost coefficients in $s''$ and 75 rightmost
coefficients in $t''$ that have not yet been constrained, for a total of 150
columns). For $s$, we get:

    [...]

The deviation pattern for $s$ is:

    [...]

The deviation pattern for $t$ is:

    [...]

Thus, we have produced an $s$ and $t$ that have 47 and 54 deviations from
$(f_0 * m)$ and $(g_0 * m)$ respectively. These values are indeed even below
the suggested parameter value of $\mathrm{Dev}_{\min} = 55$, which shows that
our forgeries would pass even stricter deviation requirements.

Obviously the $s$ and $t$ of this example have highly unusual coefficient
distributions modulo $q$, which the verifier could easily detect, but this need
not be the case in general. We can make the coefficient distribution of $s$ and
$t$ more ordinary by 1) constraining random coefficient positions and 2)
distributing the values of the constrained coefficients of $s''$ and $t''$ more
randomly modulo $q$, rather than setting them all equal to $-1$, 0 or 1.

B Determination of $w_1$ and $w_2$

The following pseudocode may also be found in the appendix of [8].



let w2 have 32 +1's and 32 -1's
set w1 to 0

compute s = f * (mes + 3 w2)
compute t = g * (mes + 3 w2)

reduce s and t modulo q
reduce s and t modulo p

// create w1, first try
for (i=0; i<N; i++)
    if (s[i] != mes[i] AND t[i] != mes[i] AND s[i] == t[i])
        w1[i] = (mes[i] - s[i]) mod p
    if (s[i] != mes[i] AND t[i] != mes[i] AND s[i] != t[i])
        w1[i] = 1 or -1 with 50% probability
loop

// create w1, second try
for (i=0; i<N; i++)
    if (s[i] != mes[i] AND t[i] == mes[i])
        w1[i] = (mes[i] - s[i]) mod p with 1/4 probability
    if (s[i] == mes[i] AND t[i] != mes[i])
        w1[i] = (mes[i] - t[i]) mod p with 1/4 probability
    if (w1 has more than 25 nonzero coefficients)
        break out of the loop
loop

// modify w2 to prevent averaging attack
for (i=0; i<N; i++)
    with probability 1/p, w2[i] = w2[i] - (mes[i] + w1[i])
w = w1 + 3 w2
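For readers who want to experiment, the pseudocode can be transcribed into Python roughly as follows. This is a hedged sketch, not the reference implementation of [8]: it assumes that "loop" simply closes the preceding for-loop, that reductions modulo $q$ and $p$ are centered, that products are cyclic convolutions modulo $x^N - 1$, and that the message coefficients lie in $\{-1, 0, 1\}$. The names `make_w`, `center` and `cyclic_convolve` are hypothetical.

```python
import random

def center(x, m):
    """Reduce x modulo m into the centered range (-m/2, m/2]."""
    r = x % m
    return r - m if r > m // 2 else r

def cyclic_convolve(a, b):
    """Product of a and b modulo x^n - 1."""
    n = len(a)
    out = [0] * n
    for i, ai in enumerate(a):
        if ai:
            for j, bj in enumerate(b):
                if bj:
                    out[(i + j) % n] += ai * bj
    return out

def make_w(f, g, mes, p=3, q=128, d_w2=32, w1_cap=25, rng=random):
    n = len(mes)
    # w2 has d_w2 coefficients +1 and d_w2 coefficients -1; w1 starts at 0.
    w2 = [0] * n
    positions = rng.sample(range(n), 2 * d_w2)
    for pos in positions[:d_w2]:
        w2[pos] = 1
    for pos in positions[d_w2:]:
        w2[pos] = -1
    w1 = [0] * n
    masked = [mes[i] + p * w2[i] for i in range(n)]
    # s = f * (mes + p*w2), t = g * (mes + p*w2), reduced mod q and then mod p.
    s = [center(center(c, q), p) for c in cyclic_convolve(f, masked)]
    t = [center(center(c, q), p) for c in cyclic_convolve(g, masked)]
    # First try: positions where both s and t deviate from the message.
    for i in range(n):
        if s[i] != mes[i] and t[i] != mes[i]:
            if s[i] == t[i]:
                w1[i] = center(mes[i] - s[i], p)
            else:
                w1[i] = rng.choice((1, -1))
    # Second try: positions where exactly one of s, t deviates.
    for i in range(n):
        if s[i] != mes[i] and t[i] == mes[i] and rng.random() < 0.25:
            w1[i] = center(mes[i] - s[i], p)
        elif s[i] == mes[i] and t[i] != mes[i] and rng.random() < 0.25:
            w1[i] = center(mes[i] - t[i], p)
        if sum(1 for c in w1 if c) > w1_cap:
            break
    # Modify w2 with probability 1/p per coefficient.
    for i in range(n):
        if rng.random() < 1.0 / p:
            w2[i] -= mes[i] + w1[i]
    w = [w1[i] + p * w2[i] for i in range(n)]
    return w1, w2, w
```

The dependence of `w1` on the message and on both `s` and `t` is the "quirk" that makes the per-index distributions only nearly identical in the statistical analysis of Section 4.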
Abstract. At Crypto ’88, Matsumoto, Kato and Imai proposed a protocol, known
as RSA-S1, in which a smart card computes an RSA signature with the help of an
untrusted powerful server. There exist two kinds of attacks against such
protocols: passive attacks (where the server does not deviate from the
protocol) and active attacks (where the server may return false values).
Pfitzmann and Waidner presented at Eurocrypt ’92 a passive meet-in-the-middle
attack and a few active attacks on RSA-S1. They discussed two simple
countermeasures to thwart such attacks: renewing the decomposition of the RSA
private exponent, and checking the signature (in which case a small public
exponent must be used). We present a new lattice-based provable passive attack
on RSA-S1 which recovers the factorization of the RSA modulus when a very small
public exponent is used, for many choices of the parameters. The first
countermeasure does not prevent this attack because the attack is a one-round
attack, that is, only a single execution of the protocol is required.
Interestingly, Merkle and Werchner recently provided a security proof of RSA-S1
against one-round passive attacks in some generic model, even for parameters to
which our attack provably applies. Thus, our result throws doubt on the real
significance of security proofs in the generic model, at least for server-aided
RSA protocols. We also present a simple analysis of a multi-round lattice-based
passive attack proposed last year by Merkle.

Keywords: Cryptanalysis, RSA signature, Server-aided protocol, Lattices.



1 Introduction 

Small units like chip cards or smart cards are able to compute, store and
protect data. Today, many of these cards include fast and secure coprocessors
that quickly perform the expensive operations needed by public key
cryptosystems. However, a large proportion of cards are cheap cards whose
computing power is too limited for such tasks. To overcome this problem,
extensive research has been conducted under the generic name “server-aided
secret computations” (SASC). In a SASC protocol, the client (the smart card)
wants to perform a secret computation (for example, RSA signature generation)
by borrowing the computing power of an untrusted powerful server without
revealing its secret information. One distinguishes two kinds of attacks
against such protocols: attacks where the server rigorously follows the
protocol are called passive attacks, while attacks where the server may return
false computations are called active attacks. Attacks are called multi-round
when they require several executions of the protocol between the same parties.

* Work supported in part by the RNRT “Turbo-signatures” project of the French
Ministry of Research.

** Work supported in part by the Australian Research Council.

Most of the SASC protocols proposed for RSA signatures have been shown to
be either inefficient or insecure (see for instance the two recent examples
[13,10]), which explains why, to our knowledge, none of these protocols has ever
been used in practice. Many of these protocols are variants of the protocols
RSA-S1 and RSA-S2 proposed by Matsumoto, Kato and Imai [8] at Crypto ’88, which
use a random linear decomposition of the RSA private exponent. At Eurocrypt
’92, Pfitzmann and Waidner [15] presented several natural meet-in-the-middle
passive attacks and some efficient active attacks against RSA-S1 and RSA-S2. To
prevent such attacks, they discussed two countermeasures which should be used
together: one is to renew the decomposition of the private exponent at each
signature, the other is to check the signature before the end of the protocol,
which is a well-known countermeasure but requires a very small public exponent
since the check is performed by the card.

The first countermeasure was effective against the original active attacks
of [15], but Merkle [10] showed last year at ACM CCS ’00 that the resulting
scheme was still insecure. Indeed, he presented an efficient lattice-based
multi-round passive attack, which was successful (in practice) against many
choices of the parameters. Merkle’s paper [10] included an analysis of the
attack, inspired by well-known lattice-based methods [5] to solve the subset
sum problem. However, the analysis was rather technical and not exactly correct
(it assumed a distribution of the parameters which was not the one induced by
the protocol). We present a simple analysis of a slight variant of Merkle’s
attack, which makes it possible to explain experimental results and to provide
provable results for certain choices of the parameters.

The main contribution of this paper is a new lattice-based passive attack
which recovers the private exponent (like Merkle’s attack), but only in the case
where a very small public exponent is used (which is the second countermeasure).
Interestingly, this attack is only one-round in the sense that a single
execution of the protocol is sufficient, whereas Merkle’s attack is multi-round,
requiring many signatures produced by the card with the help of the same server.
Consequently, the first countermeasure has no impact on this new attack. These
results also point out the limits of the generic model, as applied to the
security analysis of server-aided RSA protocols. Indeed, Merkle and Werchner
[11] proved at PKC ’98 that the RSA-S1 protocol was secure against one-round
passive attacks in the generic model, in the sense that all generic attacks have
complexity at least that of a square-root attack (better than the
meet-in-the-middle attack presented by Pfitzmann and Waidner [15]). Roughly
speaking, in this context, generic attacks (see [11] for a precise definition)
do not take advantage of special properties of the group used. However, our
attack shows that the RSA-S1 scheme is not even secure against one-round passive
attacks in the standard model of computation. In particular, the attack provably
works against certain choices of the parameters to which the square-root attack
cannot apply. Thus, contrary to what Merkle and Werchner claimed in [11], the
generic model is not appropriate for investigating the security of server-aided
RSA protocols.

The rest of the paper is organized as follows. In Section 2, we give a short
description of the RSA-S1 server-aided protocol and review some useful
background. We refer to [8,15] for more details. In Section 3, we present our
variant of Merkle’s lattice-based attack, together with an analysis. In Section
4, we present our new lattice-based attack on low-exponent RSA-S1.



2 Background 

2.1 The RSA-S1 Server-Aided Protocol

Let $N$ be an RSA modulus and let $\varphi$ denote the Euler function. Let $e$
and $d$ be respectively the RSA public and private exponents:

$$ed \equiv 1 \pmod{\varphi(N)}.$$

For an integer $s$ we denote by $[s]$ the set of integers of the interval
$[0, s-1]$ and by $[s]_{\pm}$ the set of integers of the interval
$[-s+1, s-1]$.

Let $k$, $\ell$ and $m$ be positive integers and let $\mathcal{B}_{k,\ell,m}$
be the set of vectors

$$\mathbf{f} = (f_1, \ldots, f_m) \in [2^\ell]^m$$

with $\gcd(f_1, \ldots, f_m, \varphi(N)) = 1$ and with

$$\sum_{i=1}^{m} \mathrm{wt}(f_i) = k, \tag{1}$$

where $\mathrm{wt}(f)$ denotes the Hamming weight, that is, the sum of binary
digits of an integer $f \ge 0$.

The RSA-S1 server-aided protocol from [8] computes an RSA signature
$x^d \pmod N$ with the help of an (untrusted) server in the following way:

The RSA-S1 Protocol.

Step 1 The card selects a vector $\mathbf{f} = (f_1, \ldots, f_m) \in
\mathcal{B}_{k,\ell,m}$ at random according to any fixed probability
distribution.

Step 2 The card sends a vector $\mathbf{d} = (d_1, \ldots, d_m) \in
[\varphi(N)]^m$ chosen uniformly at random from the set of vectors satisfying
the congruence

$$\sum_{i=1}^{m} f_i d_i \equiv d \pmod{\varphi(N)}, \tag{2}$$

if possible. Otherwise the card returns to Step 1.

Step 3 The card asks the server to compute and return $z_i = x^{d_i} \pmod N$,
$i = 1, \ldots, m$.

Step 4 The card computes

$$x^d = \prod_{i=1}^{m} z_i^{f_i} \pmod N.$$

Our description follows the presentation of [10] rather than that of the
original paper [8]. For instance, [8] asks that
$\sum_{i=1}^{m} \mathrm{wt}(f_i) \le k$ instead of (1), but this difference is
marginal as all our results can easily be adapted to this case.
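The four steps can be simulated end to end with toy parameters. The sketch below is an illustration under explicit assumptions, not the protocol's specification: the primes 1009 and 1013, the exponent $e = 5$ and the values of $k$, $\ell$, $m$ are arbitrary toy choices; Step 1 picks the $k$ set bits uniformly; and Step 2 is simplified by requiring some $f_j$ to be invertible modulo $\varphi(N)$, retrying otherwise. All function names are hypothetical. It needs Python 3.8 or later for `pow(a, -1, n)`.

```python
import math
import random
from functools import reduce

def sample_f(k, ell, m, phi, rng):
    """Step 1: f in [2^ell]^m with total Hamming weight k and gcd(f, phi) = 1."""
    while True:
        f = [0] * m
        for bit in rng.sample(range(m * ell), k):   # k distinct bit positions
            f[bit // ell] |= 1 << (bit % ell)
        if reduce(math.gcd, f, phi) == 1:
            return f

def sample_d_vector(f, d, phi, rng):
    """Step 2 (simplified): random d_i with sum f_i * d_i = d mod phi.
    Returns None when no f_j is invertible modulo phi."""
    pivots = [j for j, fj in enumerate(f) if math.gcd(fj, phi) == 1]
    if not pivots:
        return None
    j = pivots[0]
    dv = [rng.randrange(phi) for _ in f]
    rest = sum(fi * di for i, (fi, di) in enumerate(zip(f, dv)) if i != j)
    dv[j] = (d - rest) * pow(f[j], -1, phi) % phi
    return dv

def server_aided_sign(x, d, N, phi, k, ell, m, rng):
    while True:
        f = sample_f(k, ell, m, phi, rng)
        dv = sample_d_vector(f, d, phi, rng)
        if dv is not None:
            break
    z = [pow(x, di, N) for di in dv]        # Step 3: done by the server
    sig = 1
    for zi, fi in zip(z, f):                # Step 4: done by the card
        sig = sig * pow(zi, fi, N) % N
    return sig, f, dv

p, q, e = 1009, 1013, 5                     # toy values, far too small for real use
N, phi = p * q, (p - 1) * (q - 1)
d = pow(e, -1, phi)
x = 123456
sig, f, dv = server_aided_sign(x, d, N, phi, k=8, ell=6, m=5, rng=random.Random(2001))
print(sig == pow(x, d, N))                  # True
```

The server only ever sees $x$ and the $d_i$; the secret that protects $d$ is the short vector $\mathbf{f}$, which is why the attacks discussed later aim at recovering the $f_i$.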

For Step 4, the card mainly has two possibilities, due to memory restrictions.
One is the square-and-multiply method, which requires at most $k\ell$ modular
multiplications and very little memory. The other is the algorithm of [4], which
computes $\prod_{i=1}^{m} z_i^{f_i} \pmod N$ efficiently but requires more
memory than the square-and-multiply method. When using this algorithm, to
optimize the choice of the parameters, one should remove the restriction (1)
and replace the choice $f_i \in [2^\ell]$ by $f_i \in [h]$ where $h$ is some
small integer, not necessarily a power of 2. The algorithm then requires at
most $m + h - 3$ modular multiplications, and the temporary storage of either
$m$ or $h - 1$ elements, according to whether the card stores all the $m$
elements $z_1, \ldots, z_m$, or the $h - 1$ elements
$t_j = \prod_{f_i = j} z_i$, $1 \le j < h$ (which must be computed upon
reception of the $z_i$’s). Other known tricks to speed up the computation of
products of exponentiations (see [6] and [9, Sect. 14.6]) do not seem to be
useful in this context.
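The bucket idea behind the second method can be sketched as follows. The text does not spell out the algorithm of [4], so this is an assumption: a standard running-product technique that groups the $z_i$ by exponent into the elements $t_j$ and then evaluates $\prod_j t_j^{\,j}$ using multiplications only. The function name `product_of_powers` is hypothetical, and the sketch does not try to reproduce the exact multiplication count quoted above.

```python
def product_of_powers(z, f, h, N):
    """Compute prod z_i^{f_i} mod N for exponents 0 <= f_i < h using buckets."""
    t = [1] * h                       # t[j] = product of the z_i with f_i == j
    for zi, fi in zip(z, f):
        t[fi] = t[fi] * zi % N
    running = 1                       # product of t[j'] over all j' >= j
    result = 1
    for j in range(h - 1, 0, -1):
        running = running * t[j] % N
        result = result * running % N # t[j] ends up multiplied in exactly j times
    return result

z = [17, 23, 42, 99, 7]
f = [3, 1, 3, 0, 2]
print(product_of_powers(z, f, h=4, N=1022117))
```

Each bucket element $t_j$ enters the running product once and is then carried through $j$ further multiplications into the result, so no exponentiation is needed on the card.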

The protocol requires the transfer of approximately $2m \log N$ bits. Since the
bandwidth of a cheap smartcard is typically 9600 baud, this means that $m$
must be restricted to low values. For instance, with a 1024-bit modulus, the
value $m = 50$ already represents 10.7 seconds.
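The quoted figure follows from simple arithmetic, sketched here under the assumption that one baud carries one bit per second with no framing overhead and that the card sends $m$ exponents and receives $m$ residues, each about as long as the modulus. The function name is illustrative.

```python
def transfer_seconds(m, modulus_bits, baud=9600):
    """Time to move about 2 * m * log2(N) bits over a link of the given rate."""
    bits = 2 * m * modulus_bits       # m values d_i sent, m values z_i received
    return bits / baud

print(round(transfer_seconds(50, 1024), 1))   # 10.7
```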

2.2 Passive Attacks on RSA-S1

Notice that the protocol is broken as soon as the $f_i$’s are disclosed. Indeed,
the integer $\sum_{i=1}^{m} f_i d_i$ is congruent to the RSA private exponent
modulo $\varphi(N)$, and therefore enables one to sign any message (and this
can be checked thanks to the public exponent $e$). And, of course, one may
further recover the factorization of $N$ in randomized polynomial time from
$e \sum_{i=1}^{m} f_i d_i - 1$, which is a non-zero multiple of $\varphi(N)$
(see for instance [9, Section 8.2.2]).
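The randomized factoring step can be sketched with the classical square-root-of-unity argument. This is a textbook technique written out under assumptions, not code from the paper: $N$ is taken to be a product of two distinct odd primes, `M` stands for any non-zero multiple of $\varphi(N)$, and the toy primes and the function name are illustrative.

```python
import math
import random

def factor_from_phi_multiple(N, M, rng=random, max_tries=100):
    """Return a nontrivial factor of N given a nonzero multiple M of phi(N)."""
    s, t = 0, M
    while t % 2 == 0:                 # write M = 2^s * t with t odd
        t //= 2
        s += 1
    for _ in range(max_tries):
        a = rng.randrange(2, N - 1)
        g = math.gcd(a, N)
        if g > 1:
            return g                  # lucky hit: a shares a factor with N
        y = pow(a, t, N)
        for _ in range(s):
            nxt = y * y % N
            if nxt == 1:
                if y != 1 and y != N - 1:
                    return math.gcd(y - 1, N)   # nontrivial square root of 1
                break                 # only trivial roots seen, try another a
            y = nxt
    return None

p, q, e = 1009, 1013, 5               # toy values for illustration only
N, phi = p * q, (p - 1) * (q - 1)
d = pow(e, -1, phi)
D = d + 3 * phi                       # stands in for the integer sum of f_i * d_i
factor = factor_from_phi_multiple(N, e * D - 1, rng=random.Random(7))
print(factor in (p, q))               # True
```

Each random base succeeds with probability at least one half, so a handful of tries is normally enough.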

The authors of [8] claimed that the only possible passive attack was to
exhaustively search for the $f_i$’s, which requires roughly $C$ operations,
where

$$C = \binom{m\ell}{k}.$$

But obviously, one can devise simple meet-in-the-middle passive attacks.
Pfitzmann and Waidner [15] noticed that one could split $(f_1, \ldots, f_m)$ as
$(g_1, \ldots, g_m) + (h_1, \ldots, h_m)$ where
$\sum \mathrm{wt}(g_i) \le \sum \mathrm{wt}(h_i) = \lceil k/2 \rceil$, and
deduced an attack with time and space complexity roughly

$$\binom{m\ell}{\lceil k/2 \rceil}.$$

The attack of [15] is however not optimal: the complexity can easily be improved
using a trick used by Coppersmith [18] in a meet-in-the-middle attack against
the discrete logarithm problem with low Hamming weight. By choosing random
subsets of cardinality $\lfloor m\ell/2 \rfloor$ inside $\{1, \ldots, m\ell\}$,
one obtains a randomized meet-in-the-middle attack with time and space
complexity roughly

$$\binom{\lfloor m\ell/2 \rfloor}{\lfloor k/2 \rfloor}.$$

Thus, we obtain an attack of complexity roughly the square root $\sqrt{C}$ of
that of exhaustive search. Therefore in our numerical experiments we mainly
consider sets of parameters for which $C > [\ldots]$. Note however that even
with $C \approx 2^{100}$, the square-root attack is not very practical, due to
memory constraints.
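The relation between the exhaustive-search cost and the square-root attack can be explored numerically. The sketch below rests on stated assumptions: the search space is taken to be the number of vectors in $[2^\ell]^m$ of total Hamming weight $k$ (the gcd condition is ignored), the split attack cost is taken as a binomial on half the bit positions and half the weight, and the parameters $m = 50$, $\ell = 10$, $k = 20$ are hypothetical. The function names are illustrative.

```python
import math
from itertools import product

def exhaustive_cost(m, ell, k):
    """Number of ways to place k set bits among m * ell bit positions."""
    return math.comb(m * ell, k)

def split_cost(m, ell, k):
    """Cost estimate when half the positions carry half the weight."""
    return math.comb((m * ell) // 2, k // 2)

def count_weight_k_vectors(m, ell, k):
    """Brute-force count of weight-k vectors, usable only for tiny parameters."""
    return sum(1 for f in product(range(2 ** ell), repeat=m)
               if sum(bin(fi).count("1") for fi in f) == k)

m, ell, k = 50, 10, 20                # hypothetical parameter set
C = exhaustive_cost(m, ell, k)
print(round(math.log2(C), 1), round(math.log2(split_cost(m, ell, k)), 1))
```

For such parameters the second figure is a little under half of the first, which matches the square-root behaviour described in the text up to small factors.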

In [11], Merkle and Werchner proposed an adaptation of generic algorithms
(see [17]) to server-aided RSA protocols, and showed that any one-round passive
generic attack on RSA-S1 had complexity at least $\Omega(\sqrt{C})$.

In [15], Pfitzmann and Waidner also presented a few active attacks which,
contrary to the passive attacks mentioned previously, cannot be avoided by
increasing the parameters. They discussed two countermeasures to prevent their
own active attacks:

• Renewing the decomposition of the private exponent $d$ at each execution of
the protocol, as described in Steps 1 and 2.

• Verifying the signature $x^d \pmod N$ before releasing it, by computing
$(x^d)^e \pmod N$ and checking that it is equal to $x$. This countermeasure is
well-known and requires a very small public exponent $e$ (otherwise there is no
computational advantage in using the server to compute $x^d \pmod N$).

The second countermeasure seems necessary but is not sufficient to prevent one
of the active attacks of [15], and it creates the attack of Section 4. The first
countermeasure prevents all the active attacks of [15], but creates the passive
attack of Merkle [10], which we analyze in Section 3. Interestingly, it seems that
the attacks of Sections 3 and 4 do not apply to the RSA-S2 protocol, which is
a CRT variant of RSA-S1 (see [8,15]). The situation is reminiscent of that of
RSA with small private exponent, in which the best attack known [3] fails if the
private exponent is small modulo both $p - 1$ and $q - 1$.

2.3 Lattices 

Our attacks are based on lattice basis reduction, a familiar tool in public-key
cryptanalysis. We give a brief overview of lattice theory (see the survey [14]
for a list of references). In this paper, we call a lattice any subgroup of
$(\mathbb{Z}^n, +)$: in the literature, these are called integer lattices. For
any set of vectors $\mathbf{b}_1, \ldots, \mathbf{b}_d \in \mathbb{Z}^n$, we
define the set of all integral linear combinations:

$$L(\mathbf{b}_1, \ldots, \mathbf{b}_d) = \left\{ \sum_{i=1}^{d} n_i \mathbf{b}_i : n_i \in \mathbb{Z} \right\}.$$

By definition, $L(\mathbf{b}_1, \ldots, \mathbf{b}_d)$ is a lattice, called the
lattice spanned by the vectors $\mathbf{b}_1, \ldots, \mathbf{b}_d$. A basis of
a lattice $L$ is a set of linearly independent vectors
$\mathbf{b}_1, \ldots, \mathbf{b}_d$ such that:

$$L = L(\mathbf{b}_1, \ldots, \mathbf{b}_d).$$

In any lattice, there is always at least one basis, and in general, there are in
fact infinitely many lattice bases. But all the bases of a lattice $L$ have the
same number of elements, called the rank or dimension of the lattice. All the
bases also have the same $d$-dimensional volume, which is by definition the
square root of the determinant
$\det_{1 \le i,j \le d} \langle \mathbf{b}_i, \mathbf{b}_j \rangle$, where
$\langle \cdot, \cdot \rangle$ denotes the Euclidean inner product. This volume
$\mathrm{vol}(L)$ is called the volume or determinant of the lattice. When the
lattice dimension $d$ is equal to the space dimension $n$, this volume is simply
the absolute value of the determinant of any lattice basis.
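The invariance of the volume under a change of basis can be checked on a small example. The sketch below is illustrative: it computes the Gram determinant (the square of the volume) with exact integer arithmetic, using a fraction-free elimination chosen here for convenience, and the example bases and function names are hypothetical.

```python
def int_det(matrix):
    """Exact determinant of a square integer matrix (Bareiss elimination)."""
    a = [row[:] for row in matrix]
    n = len(a)
    sign, prev = 1, 1
    for k in range(n - 1):
        if a[k][k] == 0:              # find a row to swap in, if any
            swap = next((r for r in range(k + 1, n) if a[r][k] != 0), None)
            if swap is None:
                return 0
            a[k], a[swap] = a[swap], a[k]
            sign = -sign
        for i in range(k + 1, n):
            for j in range(k + 1, n):
                a[i][j] = (a[i][j] * a[k][k] - a[i][k] * a[k][j]) // prev
        prev = a[k][k]
    return sign * a[n - 1][n - 1]

def volume_squared(basis):
    """Determinant of the Gram matrix of inner products <b_i, b_j>."""
    gram = [[sum(x * y for x, y in zip(bi, bj)) for bj in basis] for bi in basis]
    return int_det(gram)

basis = [[1, 2, 0], [0, 1, 3]]        # rank-2 lattice in Z^3
other_basis = [[1, 4, 6], [0, 1, 3]]  # b1 + 2*b2 and b2: same lattice
print(volume_squared(basis), volume_squared(other_basis))   # 46 46
```

In the full-rank case the Gram determinant is the square of the ordinary determinant, for example `volume_squared([[2, 1], [1, 3]])` equals $5^2 = 25$.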

For a vector a, we denote by ||a|| its Euclidean norm. A basic problem in 
lattice theory is the shortest vector problem (SVP): given a basis of a lattice L, 
find a non-zero vector $\mathbf{v} \in L$ such that $\|\mathbf{v}\|$ is minimal among all non-zero lattice
vectors. Any such vector is called a shortest lattice vector. It is well-known that
the Euclidean norm of a shortest lattice vector is always less than $\sqrt{d}\,\mathrm{vol}(L)^{1/d}$,
$d$ denoting the lattice dimension. In “usual” lattices, one does not expect the
norm of a shortest lattice vector to be much less than this upper bound. 
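
The volume and the upper bound on the shortest vector can be computed directly from a basis. The following is a minimal Python sketch, not part of the original paper; the function names are hypothetical. It uses the Gram determinant definition given above, with an exact integer determinant, and converts to floating point only at the end (so it is suitable for small illustrative examples, not for cryptographic sizes).

```python
from math import sqrt

def gram_matrix(basis):
    """Gram matrix of pairwise Euclidean inner products <b_i, b_j>."""
    return [[sum(x * y for x, y in zip(u, v)) for v in basis] for u in basis]

def det_exact(matrix):
    """Exact integer determinant via fraction-free (Bareiss) elimination."""
    a = [row[:] for row in matrix]
    n, sign, prev = len(a), 1, 1
    for k in range(n - 1):
        if a[k][k] == 0:
            # Look for a row below with a non-zero entry in column k.
            pivot = next((r for r in range(k + 1, n) if a[r][k] != 0), None)
            if pivot is None:
                return 0
            a[k], a[pivot] = a[pivot], a[k]
            sign = -sign
        for i in range(k + 1, n):
            for j in range(k + 1, n):
                a[i][j] = (a[i][j] * a[k][k] - a[i][k] * a[k][j]) // prev
        prev = a[k][k]
    return sign * a[n - 1][n - 1]

def lattice_volume(basis):
    """vol(L): square root of the Gram determinant of any basis."""
    return sqrt(det_exact(gram_matrix(basis)))

def shortest_vector_bound(basis):
    """The upper bound sqrt(d) * vol(L)^(1/d) on the shortest vector norm."""
    d = len(basis)
    return sqrt(d) * lattice_volume(basis) ** (1.0 / d)

# A rank-2 lattice inside Z^3 (d = 2, n = 3).
basis = [[2, 0, 0], [0, 3, 0]]
print(lattice_volume(basis), shortest_vector_bound(basis))
```

In this example the shortest vector is $(2, 0, 0)$, whose norm 2 is indeed below the bound.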

Many attacks in public-key cryptanalysis work by reduction to SVP, or to 
approximating SVP (see the survey [14]). The shortest vector problem was re- 
cently shown to be NP-hard under randomized reductions [1], and therefore, 
it is now widely believed that there is no polynomial-time algorithm to solve 
SVP. However, there exist polynomial-time algorithms which can provably ap- 
proximate SVP. The first algorithm of that kind was the celebrated LLL lattice 
basis reduction algorithm of Lenstra, Lenstra and Lovász [7]. We use the best
deterministic polynomial-time algorithm currently known to approximate SVP, 
which is due to Schnorr [16] and is based on LLL: 

Lemma 1. There exists a deterministic polynomial time algorithm which, given 
as input a basis of an s-dimensional lattice L, outputs a non-zero lattice vector 
$\mathbf{u} \in L$ such that:

$$\|\mathbf{u}\| \le 2^{O(s \log^2 \log s / \log s)} \min\{\|\mathbf{z}\| : \mathbf{z} \in L, \mathbf{z} \ne 0\}.$$

Recently, Ajtai et al. [2] discovered a randomized algorithm which slightly improves
the approximation factor to $2^{O(s \log \log s / \log s)}$. In practice, the best algorithm
to approximate SVP is a heuristic variant of Schnorr’s algorithm [16]. Interestingly,
these algorithms typically perform much better than theoretically expected: they
often return a shortest lattice vector, provided
that the lattice dimension is not too large. Hence, it is useful to predict what 
can be achieved efficiently if an SVP-oracle (that is, an algorithm which solves 
SVP) is available. For instance, this was done for the subset sum problem [5]. 
However, unless the lattice dimension is extremely small, it is hard to predict 
beforehand whether an SVP-instance is solvable in practice, which means that 
experiments are always necessary in this case. 
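
To make the notion of lattice basis reduction concrete, here is a minimal textbook LLL sketch in Python. It is not Schnorr's algorithm from Lemma 1 and not the NTL implementation used in the experiments; it is the classical LLL procedure with exact rational arithmetic and the common parameter $\delta = 3/4$ (an assumption of this sketch). It recomputes the Gram-Schmidt data after every change, which is slow but easy to check, and it expects linearly independent integer row vectors.

```python
from fractions import Fraction

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(B):
    """Return the Gram-Schmidt vectors B* and the coefficients mu[i][j]."""
    Bs, mu = [], [[Fraction(0)] * len(B) for _ in B]
    for i, b in enumerate(B):
        v = [Fraction(x) for x in b]
        for j in range(i):
            mu[i][j] = dot(b, Bs[j]) / dot(Bs[j], Bs[j])
            v = [vi - mu[i][j] * wj for vi, wj in zip(v, Bs[j])]
        Bs.append(v)
    return Bs, mu

def lll(B, delta=Fraction(3, 4)):
    """Textbook LLL reduction of a basis given as integer row vectors."""
    B = [list(b) for b in B]
    n = len(B)
    Bs, mu = gram_schmidt(B)
    k = 1
    while k < n:
        # Size-reduce b_k against b_{k-1}, ..., b_0.
        for j in range(k - 1, -1, -1):
            q = round(mu[k][j])
            if q:
                B[k] = [a - q * b for a, b in zip(B[k], B[j])]
                Bs, mu = gram_schmidt(B)
        # Lovasz condition: advance, or swap and step back.
        if dot(Bs[k], Bs[k]) >= (delta - mu[k][k - 1] ** 2) * dot(Bs[k - 1], Bs[k - 1]):
            k += 1
        else:
            B[k], B[k - 1] = B[k - 1], B[k]
            Bs, mu = gram_schmidt(B)
            k = max(k - 1, 1)
    return B

reduced = lll([[201, 37], [1648, 297]])
print(reduced)
```

The reduced basis generates the same lattice (the determinant is preserved up to sign), and its first vector is short relative to the volume, in line with the approximation guarantees discussed above.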



3 An Analysis of Merkle’s Multi-round Attack 

3.1 Merkle’s Attack 

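The attack of Merkle [10] is based on the following observation: Because for each
$\mathbf{f} = (f_1, \ldots, f_m) \in B_{k,\ell,m}$ and $\mathbf{d} = (d_1, \ldots, d_m) \in [\varphi(N)]^m$

$$0 \le \sum_{i=1}^{m} f_i d_i < k 2^{\ell} \varphi(N),$$

we have

$$\sum_{i=1}^{m} f_i d_i = d + j\varphi(N)$$

with $j \in [k 2^{\ell}]$, that is, $j$ cannot take too many distinct values.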

It is shown in [10] that regardless of the distribution of the vectors $\mathbf{f} \in B_{k,\ell,m}$,
with probability at least $1/(k 2^{\ell})$, for two pairs $\mathbf{f}_1 = (f_1, \ldots, f_m)$,
$\mathbf{d}_1 = (d_1, \ldots, d_m)$, and $\mathbf{f}_2 = (f_{m+1}, \ldots, f_{2m})$, $\mathbf{d}_2 = (d_{m+1}, \ldots, d_{2m})$ of vectors
produced by the above protocol we have the following equation (over the integers
rather than modulo N):

$$\sum_{i=1}^{m} f_i d_i = \sum_{i=m+1}^{2m} f_i d_i. \qquad (3)$$

In fact, any rule to select the above vectors gives rise to a collision after at most
$k 2^{\ell}$ executions of the protocol. Besides, the “birthday paradox” suggests that a
collision is likely to happen after roughly $k^{1/2} 2^{\ell/2}$ executions of the protocol.
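
The collision argument can be tried out on a toy instance. The Python sketch below is an illustration only: the way a protocol run is sampled (`sample_execution`) is a hypothetical stand-in for RSA-S1, chosen so that the congruence $\sum f_i d_i \equiv d \pmod{\varphi(N)}$ is easy to satisfy, and the tiny primes are assumptions made for speed. It records the integer $j$ of each run and stops at the first repeated value, at which point the two runs satisfy equation (3) over the integers. It requires Python 3.8 or later for `pow(x, -1, n)`.

```python
import random
from math import gcd

def sample_execution(d, phi, m, k, ell, rng):
    """One toy run: a vector f with k non-zero entries below 2^ell, and a
    vector dvec in [phi]^m with sum(f_i * dvec_i) = d (mod phi)."""
    while True:
        f = [0] * m
        for i in rng.sample(range(m), k):
            f[i] = rng.randrange(1, 2 ** ell)
        # We need one coefficient invertible mod phi to solve the congruence.
        pivots = [i for i in range(m) if f[i] and gcd(f[i], phi) == 1]
        if pivots:
            break
    j = pivots[0]
    dvec = [rng.randrange(phi) for _ in range(m)]
    rest = sum(f[i] * dvec[i] for i in range(m) if i != j)
    dvec[j] = (d - rest) * pow(f[j], -1, phi) % phi
    return f, dvec

def inner(f, dvec):
    return sum(fi * di for fi, di in zip(f, dvec))

def find_collision(d, phi, m, k, ell, rng):
    """Run the toy protocol until two runs share the same integer j."""
    seen, runs = {}, 0
    while True:
        f, dvec = sample_execution(d, phi, m, k, ell, rng)
        runs += 1
        j, rem = divmod(inner(f, dvec) - d, phi)
        assert rem == 0 and 0 <= j < k * 2 ** ell
        if j in seen:
            return seen[j], (f, dvec), runs
        seen[j] = (f, dvec)

rng = random.Random(2001)
p, q, e = 1009, 1013, 5            # toy primes, far too small for real use
phi = (p - 1) * (q - 1)
d = pow(e, -1, phi)
(f1, d1), (f2, d2), runs = find_collision(d, phi, m=6, k=2, ell=3, rng=rng)
print(runs, inner(f1, d1) == inner(f2, d2))
```

Since $j$ takes fewer than $k 2^{\ell}$ values, the loop must stop after at most $k 2^{\ell} + 1$ runs; in practice it usually stops much earlier, as the birthday estimate suggests.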

The linear equation (3) is unusual because each $f_i$ is small (compared to the
$d_i$’s), and this can be interpreted in terms of lattices. More precisely, it is argued
in [10] that $(\mathbf{f}_1, \mathbf{f}_2)$ is the shortest vector in a particular lattice related to the
homogeneous equation (3) and the congruences

$$\sum_{i=1}^{m} f_i d_i \equiv \sum_{i=m+1}^{2m} f_i d_i \equiv d \pmod{\varphi(N)}. \qquad (4)$$

However, the analysis presented by Merkle is not sufficient, because it assumes 
a distribution of the parameters which is not the one of the protocol (see [10,
Theorem 2.1]). And no result is proposed without SVP-oracles. Hence, Merkle’s
attack, as presented in [10], is not a proved attack, even under the assumption 
of an SVP-oracle, which is not so unusual for a lattice-based attack. Neverthe- 
less, the experiments conducted by Merkle (see [10]) showed that the attack 
was successful in practice against many choices of the parameters. Thus, it was 
interesting to see whether Merkle’s attack could be proved, with or without 
SVP-oracles. Here, we provide a proof, for a slight variant of Merkle’s attack. 
The analysis we present can in fact be extended to the original attack, but our 
variant is slightly simpler to describe and to analyze, while the difference of 
efficiency between the two attacks is marginal. 



3.2 A Variant of Merkle’s Attack 

We work directly with the lattice corresponding to (3): Let $\mathcal{L}(\mathbf{d}_1, \mathbf{d}_2)$ be the
$(2m - 1)$-dimensional lattice formed by all vectors $\mathbf{z} \in \mathbb{Z}^{2m}$ with

$$\sum_{i=1}^{m} z_i d_i = \sum_{i=m+1}^{2m} z_i d_i.$$



This lattice is the simplest case of an orthogonal lattice (as introduced in [12]),
and one can compute a basis of such lattices in polynomial time. It can easily
be shown that the volume of the lattice is given by:

$$\mathrm{vol}(\mathcal{L}(\mathbf{d}_1, \mathbf{d}_2)) = \frac{(d_1^2 + \cdots + d_{2m}^2)^{1/2}}{\gcd(d_1, \ldots, d_{2m})}.$$



Thus, one would expect its shortest non-zero vector to have a norm around:

$$(2m - 1)^{1/2}\, \mathrm{vol}(\mathcal{L}(\mathbf{d}_1, \mathbf{d}_2))^{1/(2m-1)} \approx (2m - 1)^{1/2} \varphi(N)^{1/(2m-1)}.$$

On the other hand, the vector $\mathbf{f} = (f_1, \ldots, f_{2m})$ belongs to this lattice, and
has a norm of at most $k^{1/2} 2^{\ell}$. Hence, if $k^{1/2} 2^{\ell}$ is much smaller than
$(2m - 1)^{1/2} \varphi(N)^{1/(2m-1)}$, we expect $\mathbf{f}$ to be the shortest vector of $\mathcal{L}(\mathbf{d}_1, \mathbf{d}_2)$, and if it
is sufficiently smaller, then the gap between $\mathbf{f}$ and the other lattice vectors guarantees
that the algorithm of Lemma 1 will find it. Once $\mathbf{f}$ is known, one can derive the
value $\sum_{i=1}^{m} f_i d_i$, which is congruent to the RSA private exponent modulo $\varphi(N)$,
and therefore enables one to sign any message. And one may further recover the
factorization of N in randomized polynomial time, from $e \sum_{i=1}^{m} f_i d_i - 1$, which
is a non-zero multiple of $\varphi(N)$ (see for instance [9, Section 8.2.2]).
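
The heuristic condition is easy to evaluate numerically. The sketch below is an illustration, not part of the original analysis: it compares the two sides on a base-2 logarithmic scale, taking $\log_2 \varphi(N) \approx 1024$ as an assumption (a 1024-bit modulus) and using two of the parameter sets $(m, k, \ell)$ that appear in Table 1. The function name is hypothetical.

```python
from math import log2

def heuristic_gap_bits(m, k, ell, log2_phi):
    """log2 of the ratio between the expected shortest-vector norm
    (2m-1)^(1/2) * phi(N)^(1/(2m-1)) and the norm bound k^(1/2) * 2^ell.
    A clearly positive value means the heuristic condition is satisfied."""
    dim = 2 * m - 1
    expected_shortest = 0.5 * log2(dim) + log2_phi / dim
    target_norm = 0.5 * log2(k) + ell
    return expected_shortest - target_norm

for m, k, ell in [(32, 26, 10), (56, 26, 6)]:
    print((m, k, ell), round(heuristic_gap_bits(m, k, ell, 1024), 2))
```

A larger gap leaves more room for the approximation factor of the reduction algorithm; the gap shrinks as $m$ grows because the term $\log_2 \varphi(N)/(2m-1)$ decreases.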

In [10], the original attack of Merkle worked with a slight variant of the lattice
$\mathcal{L}(\mathbf{d}_1, \mathbf{d}_2)$, to take advantage of the fact that $f_i \in [2^{\ell}]$ and not merely $|f_i| < 2^{\ell}$. Such
a trick was used for the subset sum problem [5]. However, this trick is not as
useful here, because the distributions are different. This means that the difference
between our variant and the original attack is marginal.

3.3 Theoretical Results

The previous reasoning can in fact be made rigorous by a tight analysis, which 
gives rise to the following result: 

Theorem 1. There is a deterministic algorithm A which, given as input an
RSA modulus N, together with a public exponent e, and a set $\mathcal{D}$ of $k 2^{\ell}$ vectors
$\mathbf{d} \in [\varphi(N)]^m$ corresponding to a certain set $\mathcal{F}$ of vectors $\mathbf{f} \in B_{k,\ell,m}$ generated by
$k 2^{\ell}$ independent executions of RSA-S1, outputs a value $A(\mathcal{D})$ in time polynomial
in $k$, $2^{\ell}$, $m$, $\log N$ such that:

$$\Pr[A(\mathcal{D}) \equiv d \pmod{\varphi(N)}] \ge 1 - \frac{k^{m+2} 2^{2\ell(m+2) + O(m^2 \log^2 \log m / \log m)}}{\varphi(N)}$$

where the probability is taken over all random choices of $\mathcal{D}$ for the given $\mathcal{F}$.

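Proof. Given a set $\mathcal{D}$ of $k 2^{\ell}$ vectors $\mathbf{d}$ associated with the protocol RSA-S1,
which corresponds to a certain set $\mathcal{F}$ of $k 2^{\ell}$ (unknown) vectors $\mathbf{f} \in B_{k,\ell,m}$, the
algorithm A selects all possible pairs of such vectors $\mathbf{d}_1$ and $\mathbf{d}_2$ and uses the
algorithm of Lemma 1 to find a short vector $\mathbf{u}$ in the $(2m - 1)$-dimensional
lattice $\mathcal{L}(\mathbf{d}_1, \mathbf{d}_2)$ formed by all vectors $\mathbf{z} \in \mathbb{Z}^{2m}$ such that

$$\sum_{i=1}^{m} z_i d_i = \sum_{i=m+1}^{2m} z_i d_i.$$

We know that there is at least one pair $(\mathbf{d}_1, \mathbf{d}_2)$ such that the equation (3) holds.
Notice that for any $\mathbf{f} \in B_{k,\ell,m}$, we have

$$\|\mathbf{f}\|^2 = \sum_{i=1}^{m} f_i^2 \le 2^{\ell} \sum_{i=1}^{m} f_i \le k 2^{2\ell}. \qquad (5)$$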

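Thus, if we apply the algorithm of Lemma 1 to $\mathcal{L}(\mathbf{d}_1, \mathbf{d}_2)$, we obtain a vector
$\mathbf{u} = (u_1, \ldots, u_{2m})$ such that:

$$\|\mathbf{u}\|^2 \le 2^{O(m \log^2 \log m / \log m)} \min\{\|\mathbf{z}\|^2 : \mathbf{z} \in \mathcal{L}(\mathbf{d}_1, \mathbf{d}_2), \mathbf{z} \ne 0\}$$
$$\le 2^{O(m \log^2 \log m / \log m)} \left(\|\mathbf{f}_1\|^2 + \|\mathbf{f}_2\|^2\right).$$

Therefore, there exists some integer $U = k^{1/2} 2^{\ell + O(m \log^2 \log m / \log m)}$ such that
$|u_i| \le U$ for $i = 1, \ldots, 2m$, that is, $\mathbf{u} \in [U]_{\pm}^{2m}$.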

We write $\mathbf{u} = (\mathbf{u}_1, \mathbf{u}_2)$ where $\mathbf{u}_1, \mathbf{u}_2 \in [U]_{\pm}^{m}$ and say that $\mathbf{u}$ is similar to the
concatenation $(\mathbf{f}_1, \mathbf{f}_2)$ if either $\mathbf{u}_1$ is non-zero and parallel to $\mathbf{f}_1$, or $\mathbf{u}_2$ is non-zero
and parallel to $\mathbf{f}_2$. Notice that if one knows a vector $\mathbf{u} \ne 0$ similar to $(\mathbf{f}_1, \mathbf{f}_2)$, one
obtains at most $2^{\ell}$ possible values for either $\mathbf{f}_1$ or $\mathbf{f}_2$. And if $\mathbf{f}_1$ or $\mathbf{f}_2$ is correct,
then $\langle \mathbf{f}_1, \mathbf{d}_1 \rangle$ or $\langle \mathbf{f}_2, \mathbf{d}_2 \rangle$ is congruent to $d$ modulo $\varphi(N)$, which can be checked
by signing a message. Hence it is enough to show that with probability at least
$1 - k^{m+2} 2^{2\ell(m+2) + O(m^2 \log^2 \log m / \log m)} \varphi(N)^{-1}$ the vector $\mathbf{u} = (\mathbf{u}_1, \mathbf{u}_2)$ returned
by the algorithm of Lemma 1 is similar to $(\mathbf{f}_1, \mathbf{f}_2)$.

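First for $\mathbf{f}_1, \mathbf{f}_2 \in B_{k,\ell,m}$ we estimate the size of the set $\mathcal{E}(\mathbf{f}_1, \mathbf{f}_2)$ of pairs of
vectors $\mathbf{d}_1, \mathbf{d}_2 \in [\varphi(N)]^m$ such that for some $\mathbf{u} = (\mathbf{u}_1, \mathbf{u}_2) \in [U]_{\pm}^{2m}$ which is not
similar to $(\mathbf{f}_1, \mathbf{f}_2)$ we have the equation

$$\sum_{i=1}^{m} u_i d_i = \sum_{i=m+1}^{2m} u_i d_i. \qquad (6)$$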

Let us fix a nonzero vector $\mathbf{u} = (\mathbf{u}_1, \mathbf{u}_2) \in [U]_{\pm}^{2m}$ and a pair of vectors
$\mathbf{f}_1, \mathbf{f}_2 \in B_{k,\ell,m}$ which are not similar. Without loss of generality we may assume
that $\mathbf{u}_2 \ne 0$ and is not parallel to $\mathbf{f}_2$ and that $f_{2m} \ne 0$. Then excluding $d_{2m}$ from (6)
using (3), we obtain an equation

$$\sum_{i=1}^{m} c_i d_i = \sum_{i=m+1}^{2m-1} c_i d_i \qquad (7)$$

with $c_i = u_i - f_i u_{2m} / f_{2m}$, $i = 1, \ldots, 2m - 1$. By our assumption, for at least one
$i \ge m + 1$, the coefficient $c_i \ne 0$. Without loss of generality we may assume that
$c_{2m-1} \ne 0$. Then the first congruence in (4) gives us at most $2^{\ell} \varphi(N)^{m-1}$ possible
values for $\mathbf{d}_1 = (d_1, \ldots, d_m)$. Indeed, assuming that $f_m \ne 0$ and selecting the
integers $d_1, \ldots, d_{m-1} \in [\varphi(N)]$ arbitrarily, we obtain a congruence of the form
$f_m d_m \equiv D \pmod{\varphi(N)}$ which has at most $\gcd(f_m, \varphi(N)) \le f_m < 2^{\ell}$ solutions
$d_m \in [\varphi(N)]$. Finally, for any of the $\varphi(N)^{m-2}$ possible choices of
$d_{m+1}, \ldots, d_{2m-2} \in [\varphi(N)]$, the equation (7) gives at most one value for $d_{2m-1}$ and then the
second congruence in (4) gives us at most $\gcd(f_{2m}, \varphi(N)) \le f_{2m} < 2^{\ell}$ possible values
for $d_{2m}$. So the total number of solutions for such $\mathbf{u}$ is at most $2^{2\ell} \varphi(N)^{2m-3}$.
The total number of such vectors $\mathbf{u}$ is at most $(2U)^{2m}$. Thus we finally derive

$$\#\mathcal{E}(\mathbf{f}_1, \mathbf{f}_2) \le (2U)^{2m} 2^{2\ell} \varphi(N)^{2m-3}$$
$$\le k^{m} 2^{2\ell(m+1) + O(m^2 \log^2 \log m / \log m)} \varphi(N)^{2m-3}.$$



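For each vector $\mathbf{f} \in B_{k,\ell,m}$ there are exactly $\varphi(N)^{m-1}$ vectors $\mathbf{d} \in [\varphi(N)]^m$
satisfying the congruence (2). Therefore, the probability that there is a pair
of vectors $\mathbf{f}_1, \mathbf{f}_2 \in \mathcal{F}$ such that the corresponding vectors $\mathbf{d}_1, \mathbf{d}_2 \in \mathcal{D}$ satisfy
$(\mathbf{d}_1, \mathbf{d}_2) \in \mathcal{E}(\mathbf{f}_1, \mathbf{f}_2)$ is at most

$$\frac{(\#\mathcal{F})^2\, k^{m} 2^{2\ell(m+1) + O(m^2 \log^2 \log m / \log m)} \varphi(N)^{2m-3}}{\varphi(N)^{2m-2}}
= \frac{k^{m+2} 2^{2\ell(m+2) + O(m^2 \log^2 \log m / \log m)}}{\varphi(N)}$$

and the result follows. □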



Assuming that an SVP-oracle is available, we derive much stronger estimates.

Theorem 2. There is a deterministic algorithm A which, given access to an
SVP-oracle and as input an RSA modulus N, together with a public exponent e,
a set $\mathcal{D}$ of $k 2^{\ell}$ vectors $\mathbf{d} \in [\varphi(N)]^m$ corresponding to a certain set $\mathcal{F}$ of vectors
$\mathbf{f} \in B_{k,\ell,m}$ generated by $k 2^{\ell}$ independent executions of RSA-S1, outputs a value
$A(\mathcal{D})$ in time polynomial in $k$, $2^{\ell}$, $m$, $\log N$ such that:

$$\Pr[A(\mathcal{D}) \equiv d \pmod{\varphi(N)}] \ge 1 - \frac{k^{m+2} 2^{2(\ell m + 2\ell + m)}}{\varphi(N)}$$

where the probability is taken over all random choices of $\mathcal{D}$ for the given $\mathcal{F}$.

As in [10], instead of waiting for $k 2^{\ell}$ executions of RSA-S1 one may also
restrict to only two executions, which yields the following version of Theorems 1
and 2:

Theorem 3. There is a deterministic algorithm A which, given as input an
RSA modulus N, together with a public exponent e, a pair of vectors
$\mathbf{d}_1, \mathbf{d}_2 \in [\varphi(N)]^m$ corresponding to a pair of vectors $\mathbf{f}_1, \mathbf{f}_2 \in B_{k,\ell,m}$ generated by two
independent executions of RSA-S1, outputs a value $A(\mathbf{d}_1, \mathbf{d}_2)$ in time polynomial
in $k$, $2^{\ell}$, $m$, $\log N$ such that:

$$\Pr[A(\mathbf{d}_1, \mathbf{d}_2) \equiv d \pmod{\varphi(N)}] \ge \frac{1}{k 2^{\ell}} - \frac{k^{m} 2^{2\ell(m+1) + O(m^2 \log^2 \log m / \log m)}}{\varphi(N)}$$

where the probability is taken over all random choices of $\mathbf{d}_1, \mathbf{d}_2$ for the given $\mathbf{f}_1, \mathbf{f}_2$.

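Theorem 4. There is a deterministic algorithm A which, given access to an
SVP-oracle and as input an RSA modulus N, together with a public exponent e, a
pair of vectors $\mathbf{d}_1, \mathbf{d}_2 \in [\varphi(N)]^m$ corresponding to a pair of vectors $\mathbf{f}_1, \mathbf{f}_2 \in B_{k,\ell,m}$
generated by two independent executions of RSA-S1, makes a single call to the
SVP-oracle with the lattice $\mathcal{L}(\mathbf{d}_1, \mathbf{d}_2)$ and outputs a value $A(\mathbf{d}_1, \mathbf{d}_2)$ in time
polynomial in $k$, $2^{\ell}$, $m$, $\log N$ such that:

$$\Pr[A(\mathbf{d}_1, \mathbf{d}_2) \equiv d \pmod{\varphi(N)}] \ge \frac{1}{k 2^{\ell}} - \frac{k^{m} 2^{2(\ell m + \ell + m)}}{\varphi(N)}$$

where the probability is taken over all random choices of $\mathbf{d}_1, \mathbf{d}_2$ for the given $\mathbf{f}_1, \mathbf{f}_2$.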

Notice that unless k (and thus i > k/m) is exponentially large compared to
m, which is completely impractical, the terms $k^{m+2}$ and $k^{m}$ in the bounds of
Theorems 1 and 3 respectively, can be included in the term $2^{O(m^2 \log^2 \log m / \log m)}$.

3.4 Experiments 

In practice, the attack is as efficient as Merkle’s original attack, due to the 
fact that strong lattice basis reduction algorithms behave like oracles for the 
shortest vector problem up to moderate dimension. In [10], Merkle reported the 
experimental results presented in Table 1. Notice however that none of the sets 
of parameters of Table 1 leads to an efficient protocol (for the card).

Table 1. Experiments with Merkle’s attack

  m    k    $\ell$   Success (%)   Complexity of the sqrt attack
  25   28   11   100           $2^{52}$
  32   26   10   100           $2^{62}$
  38   26   9    100           $2^{63}$
  42   26   8    100           $2^{63}$
  48   26   7    70            $2^{63}$
  56   26   6    10            $2^{63}$


4 A New One-Round Attack on Low Exponent RSA-S1

4.1 Description of the Attack

We now assume that a very small public exponent e is used. We also assume
that the secret primes p and q defining N = pq have approximately the same
length. Let $s = p + q = O(N^{1/2})$. We have $\varphi(N) = N - s + 1$. When the RSA-S1
protocol is performed once, we have:

$$\sum_{i=1}^{m} f_i d_i \equiv d \pmod{\varphi(N)},$$

and therefore,

$$\sum_{i=1}^{m} f_i e d_i \equiv 1 \pmod{\varphi(N)}.$$

From (5) we see that there exists $r \in [k 2^{\ell} e]$ such that

$$\sum_{i=1}^{m} f_i e d_i = 1 + r\varphi(N) = 1 + r(N - s + 1).$$

Hence

$$\sum_{i=1}^{m} f_i e d_i \equiv 1 + r - rs \pmod{N}, \qquad (8)$$

where $|1 + r - rs| = O(k 2^{\ell} e N^{1/2})$. We thus obtain a linear equation modulo N
where the unknown coefficients $f_i$ and $1 + r - rs$ are all relatively small. This
suggests defining the $(m + 1)$-dimensional lattice $\mathcal{L}_{e,N}(\mathbf{d})$ spanned by the rows
of the following matrix:

$$\begin{pmatrix}
N & 0 & 0 & \cdots & 0 \\
e d_1 & eR & 0 & \cdots & 0 \\
e d_2 & 0 & eR & \ddots & \vdots \\
\vdots & \vdots & \ddots & \ddots & 0 \\
e d_m & 0 & \cdots & 0 & eR
\end{pmatrix}$$

where $R$ is an integer close to $N^{1/2}$. Obviously, the volume of this lattice is
$\mathrm{vol}(\mathcal{L}_{e,N}(\mathbf{d})) = N (eR)^m$. Therefore, one would expect its shortest vector to be of norm roughly
$(m + 1)^{1/2} e^{m/(m+1)} N^{(m+2)/(2m+2)}$. On the other hand, the lattice contains the
target vector

$$\mathbf{t} = (1 + r - rs, f_1 eR, \ldots, f_m eR),$$

whose norm is $\|\mathbf{t}\| = O(k 2^{\ell} e N^{1/2})$ because of (5). Hence, the target vector is
likely to be the shortest vector in this lattice if $k 2^{\ell} e N^{1/2}$ is much smaller
than $(m + 1)^{1/2} e^{m/(m+1)} N^{(m+2)/(2m+2)}$. Note that this condition is satisfied for sufficiently large N
and that it is very similar to the heuristic condition we obtained in Section 3.2,
which suggests that the efficiency of the attacks of Sections 3 and 4 should be
comparable. In case the target vector is really much smaller than the other
lattice vectors, then the algorithm of Lemma 1 finds it. Once the target vector is
known, we can recover a private exponent equivalent to $d$ thanks to $\sum_{i=1}^{m} f_i d_i$,
which enables one to sign any message, as in Merkle’s attack. Again, one may further
derive a not too large multiple of $\varphi(N)$, which yields the factorization of N in
randomized polynomial time.
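
The construction can be checked on a toy instance. The Python sketch below is illustrative only: the sampling routine `toy_instance` is a hypothetical stand-in for one RSA-S1 execution, the primes are tiny, and the choice $R = \lfloor N^{1/2} \rfloor$ is an assumption of this sketch. It builds the rows of the matrix above, builds the target vector $\mathbf{t}$, and verifies that $\mathbf{t}$ is the integer combination of the rows with coefficients $(-r, f_1, \ldots, f_m)$. It requires Python 3.8 or later.

```python
import random
from math import gcd, isqrt

def toy_instance(p, q, e, m, k, ell, rng):
    """Sample f (k non-zero entries below 2^ell) and dvec in [phi]^m with
    sum(f_i * dvec_i) = d (mod phi), where d = e^(-1) mod phi."""
    N, phi = p * q, (p - 1) * (q - 1)
    d = pow(e, -1, phi)
    while True:
        f = [0] * m
        for i in rng.sample(range(m), k):
            f[i] = rng.randrange(1, 2 ** ell)
        pivots = [i for i in range(m) if f[i] and gcd(f[i], phi) == 1]
        if pivots:
            break
    j = pivots[0]
    dvec = [rng.randrange(phi) for _ in range(m)]
    rest = sum(f[i] * dvec[i] for i in range(m) if i != j)
    dvec[j] = (d - rest) * pow(f[j], -1, phi) % phi
    return N, phi, f, dvec

def lattice_rows(N, e, dvec, R):
    """Rows of the (m+1) x (m+1) basis matrix of the lattice L_{e,N}(d)."""
    m = len(dvec)
    rows = [[N] + [0] * m]
    for i, di in enumerate(dvec):
        row = [e * di] + [0] * m
        row[i + 1] = e * R
        rows.append(row)
    return rows

def target_vector(N, phi, e, f, dvec, R):
    """Return t = (1 + r - r*s, f_1*e*R, ..., f_m*e*R) together with r."""
    total = sum(fi * e * di for fi, di in zip(f, dvec))
    r, rem = divmod(total - 1, phi)
    assert rem == 0
    s = N - phi + 1                      # s = p + q
    return [1 + r - r * s] + [fi * e * R for fi in f], r

rng = random.Random(4)
p, q, e, m, k, ell = 1009, 1013, 5, 6, 2, 3
N, phi, f, dvec = toy_instance(p, q, e, m, k, ell, rng)
R = isqrt(N)
rows = lattice_rows(N, e, dvec, R)
t, r = target_vector(N, phi, e, f, dvec, R)
coeffs = [-r] + f
combo = [sum(c * row[col] for c, row in zip(coeffs, rows)) for col in range(m + 1)]
print(combo == t, r)
```

Feeding `rows` to a lattice reduction routine and looking for a short vector whose last $m$ coordinates are multiples of $eR$ would be the next step of the attack; that step is not shown here.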

4.2 Theoretical Results 

The previous attack can be proved, using the same counting arguments of the 
proof of Theorem 1 : 

Theorem 5. There is a deterministic algorithm A which, given as input an
RSA modulus N = pq such that $p + q = O(N^{1/2})$, together with a public exponent e,
and a vector $\mathbf{d} \in [\varphi(N)]^m$ corresponding to a certain vector $\mathbf{f} \in B_{k,\ell,m}$
generated by RSA-S1, outputs a value $A(\mathbf{d})$ in time polynomial in $k$, $2^{\ell}$, $m$, $\log N$
such that:

$$\Pr[A(\mathbf{d}) \equiv d \pmod{\varphi(N)}] \ge 1 - \frac{k^{m+1} e^{m+1} 2^{\ell(m+1) + O(m^2 \log^2 \log m / \log m)}}{N^{1/2}}$$

where the probability is taken over all random choices of $\mathbf{d}$ for the given $\mathbf{f}$.

Proof. The algorithm A starts by applying the algorithm of Lemma 1 to find
a short vector $\mathbf{w} \ne 0$ in the $(m + 1)$-dimensional lattice $\mathcal{L}_{e,N}(\mathbf{d})$. Since $\mathbf{t}$ is a
lattice vector and because $p + q = O(N^{1/2})$, we have:

||w|| < 

By definition of the lattice, $\mathbf{w}$ is of the form:

$$\mathbf{w} = \left(u_0 N + \sum_{i=1}^{m} e d_i u_i,\; u_1 eR,\; \ldots,\; u_m eR\right),$$

where each $u_i$ is an integer.

Therefore, there exists some integer U = log m/ log m) 

such that $|u_i| < U$ for $i = 1, \ldots, m$. Thus $\mathbf{u} = (u_1, \ldots, u_m) \in [U]_{\pm}^{m}$. We may assume
that $\|\mathbf{w}\| < N$ otherwise the right hand side of the inequality of the theorem is
negative, making the bound trivial. Then necessarily $\mathbf{u} \ne 0$. We also have

$$\sum_{i=1}^{m} e d_i u_i \equiv w_0 \pmod{N} \qquad (9)$$

for some wq € [W]^ where W = O ^°s’n/iogm)jYi/ 2 'j ^ 

Clearly, we may assume that $2^{\ell} < \min\{p, q\}$ otherwise the result is trivial.
Thus for any $i = 1, \ldots, m$ with $f_i \ne 0$ we have $\gcd(f_i, N) = 1$. As before we
see that for each $w_0$ and for each $\mathbf{u} \in [U]_{\pm}^{m}$ not parallel to $\mathbf{f}$ there are at most
vectors $\mathbf{d} \in [\varphi(N)]^m$ satisfying both (8) and (9). Therefore the total
number of vectors $\mathbf{d} \in [\varphi(N)]^m$ which satisfy (8) and at least one congruence (9),
for some $w_0 \in [W]_{\pm}$ and some nonzero vector $\mathbf{u} \in [U]_{\pm}^{m}$ not parallel to $\mathbf{f}$, is at
most

_ ^m+lgm+ 12 ^ (m+l)+0(m^ log^ log m/ log m) jy 

Taking into account that ip{N) > N/2 we obtain the desired result. □ 

Of course, the same proof provides a stronger result if an SVP-oracle is available: 

Theorem 6. There is a deterministic algorithm A which, given access to an
SVP-oracle and as input an RSA modulus N = pq such that $p + q = O(N^{1/2})$,
together with a public exponent e, and a vector $\mathbf{d} \in [\varphi(N)]^m$ corresponding to a
certain vector $\mathbf{f} \in B_{k,\ell,m}$ generated by RSA-S1, makes a single call to the
SVP-oracle with the lattice $\mathcal{L}_{e,N}(\mathbf{d})$ and outputs a value $A(\mathbf{d})$ in time polynomial in
$k$, $2^{\ell}$, $m$, $\log N$ such that:

$$\Pr[A(\mathbf{d}) \equiv d \pmod{\varphi(N)}] \ge 1 - \frac{k^{m+1} e^{m+1} 2^{\ell(m+1) + O(m)}}{N^{1/2}}$$

where the probability is taken over all random choices of $\mathbf{d}$ for the given $\mathbf{f}$.

Certainly one can obtain similar results when the primes p and q are not 
balanced, although the probability of success decreases. 

4.3 Experiments 

We made a few experiments with a (balanced) 1024-bit RSA modulus and a 
public exponent e = 3, using Victor Shoup’s NTL library [19]. The experiments 
have confirmed the heuristic condition. By applying standard floating point LLL 
reduction, and improved reduction if necessary, we have been able to recover 
the private exponent for all the parameters considered by Merkle in his own 
experiments [10] (see Table 1). The success rate has been 100%, except with the 
case $(m, k, \ell) = (56, 26, 6)$ where it is 65% (for this case, Merkle only achieved
a 10% success rate). We also made some experiments on other (more realistic)
sets of parameters. For instance, over 100 samples, we have always been able
to recover the factorization with $(m, k, \ell) = (60, 30, 3)$, $(70, 30, 2)$ and $(80, 40, 1)$.
The attack takes at most a couple of minutes, as the lattice dimension is only
$m + 1$. These results show that no set of parameters for RSA-S1 provides sufficient
security without being impractical for the card.

Abstract. We study a class of problems called Modular Inverse Hidden
Number Problems (MIHNPs). The basic problem in this class is the
following: Given many pairs $(x_i, \mathrm{MSB}_k((\alpha + x_i)^{-1} \bmod p))$ for random
$x_i \in \mathbb{Z}_p$ the problem is to find $\alpha \in \mathbb{Z}_p$ (here $\mathrm{MSB}_k(x)$ refers to the $k$
most significant bits of $x$). We describe an algorithm for this problem
when $k > (\log_2 p)/3$ and conjecture that the problem is hard whenever
$k < (\log_2 p)/3$. We show that assuming hardness of some variants of
this MIHNP problem leads to very efficient algebraic PRNGs and MACs.

Keywords: Hidden number problems, PRNG, MAC, Approximations,
Modular inversion, Lattices, Coppersmith’s attack



1 Introduction 

In recent years several new complexity assumptions were used to construct effi- 
cient cryptosystems. The Decision Diffie-Hellman assumption (DDH) was used 
to construct chosen ciphertext secure encryption [7] and number theoretic pseudo 
random functions [15]. The Strong RSA assumption was used to construct effi- 
cient signature schemes [10,8]. In this paper we introduce a new class of alge- 
braic complexity assumptions which we call the Modular Inverse Hidden Number 
Problem (MIHNP). Using MIHNP we construct an efficient number theoretic 
Pseudo Random Number Generator (PRNG) and an efficient MAC. The basic
step in evaluating the MAC and the PRNG is one modular inversion modulo a
moderate size prime. No expensive exponentiations are needed.

To describe the basic MIHNP we introduce the following notation that will be
used throughout the paper: For an m-bit prime $p$ and $y \in \mathbb{Z}_p$ we use $\mathrm{MSB}_k(y \bmod p)$
to denote any integer $Y \in \mathbb{Z}_p$ satisfying $|Y - y| < p/2^{k}$. In other words, $Y$ is
an approximation to $y$ that (usually) matches $y$ on the $k$ most significant bits.
We write $\mathrm{MSB}_k(y)$ where there is no ambiguity about the modulus $p$. In addition,
throughout the paper we define the inverse of $0 \in \mathbb{Z}_p$ to be 0. We consistently
use Greek characters to denote hidden values.

MIHNP. An instance of the basic MIHNP problem is as follows: let $p$ be a fixed
m-bit prime and $k$, $n$ be positive integers. Let $\alpha$ be a random hidden element
of $\mathbb{Z}_p$. We are given $p$, $k$, and $\left(x_i, \mathrm{MSB}_k\!\left(\frac{1}{\alpha + x_i}\right)\right)$ for random values $x_1, \ldots, x_n$.

C. Boyd (Ed.): ASIACRYPT 2001, LNCS 2248, pp. 36-51, 2001. 

(c) Springer-Verlag Berlin Heidelberg 2001

The problem is to find $\alpha$. The $\delta$-MIHNP assumption states that there is no
polynomial time algorithm for the Basic-MIHNP problem whenever $k < \delta m$.

In other words, given many approximations to $(\alpha + x_i)^{-1} \bmod p$ for random
$x_i \in \mathbb{Z}_p$ the problem is to find $\alpha$. The parameters $m$, $n$, $k$ are security parameters
for the problem. Note that when $n > 2(m/k)$ the hidden number $\alpha$ is uniquely
defined with high probability and consequently there is a unique answer to this
problem. We show a lattice-based algorithm that solves this problem when $k > m/3$.
We also explain why this algorithm does not extend to solve it for $k < m/3$.
As our algorithm represents the current state-of-the-art in lattice reduction
techniques, we conjecture that such techniques cannot be used beyond the $m/3$
bound. More generally, we conjecture that the $\delta$-MIHNP assumption holds for
any $\delta < 1/3$. In the next section we introduce several variants of MIHNP that are
useful for cryptographic constructions. We also show that the MIHNP problem
has a simple limited random self reduction.
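
As an illustration of the notation, the following Python sketch generates a basic MIHNP instance. It is not taken from the paper: the function names are hypothetical, and `msb` returns one particular valid approximation (rounding down to a multiple of $\lfloor p/2^k \rfloor$), whereas the definition allows any $Y$ with $|Y - y| < p/2^k$. The prime $2^{31} - 1$ is chosen only because it is a convenient small prime. It requires Python 3.8 or later.

```python
import random

def inverse_mod(y, p):
    """Inverse in Z_p, using the convention that the inverse of 0 is 0."""
    y %= p
    return pow(y, -1, p) if y else 0

def msb(y, p, k):
    """One valid MSB_k(y mod p): an integer Y with |Y - y| < p / 2^k."""
    step = max(1, p >> k)          # floor(p / 2^k), at least 1
    y %= p
    return y - (y % step)

def mihnp_instance(alpha, p, k, n, rng):
    """Pairs (x_i, MSB_k(1 / (alpha + x_i) mod p)) for random x_i."""
    xs = [rng.randrange(p) for _ in range(n)]
    return [(x, msb(inverse_mod(alpha + x, p), p, k)) for x in xs]

rng = random.Random(7)
p, k, n = 2 ** 31 - 1, 12, 8       # p is a 31-bit prime
alpha = rng.randrange(p)           # the hidden number
pairs = mihnp_instance(alpha, p, k, n, rng)
print(pairs[0])
```

Each pair reveals roughly the top $k$ bits of $(\alpha + x_i)^{-1} \bmod p$; the remaining $m - k$ low-order bits are what an attacker has to recover.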

MIHNP is closely related to several other Hidden Number Problems (HNPs). 
Hidden number problems were introduced in [4] where they were used to prove 
the bit security of the Diffie-Hellman secret in $\mathbb{Z}_p$. The standard HNP is as follows:
let $\alpha \in \mathbb{Z}_p$ be a hidden random number. Given $\mathrm{MSB}_k(\alpha \cdot x_i \bmod p)$ for
random $x_1, \ldots, x_n \in \mathbb{Z}_p$ the problem is to find $\alpha$. The standard HNP can be
efficiently solved when $k = O(\sqrt{|p|})$, and this solution forms the basis of the
bit-security result in [4] (as well as an attack on weak versions of the Digital
Signature Algorithm (DSA), see [13]). This is in contrast to MIHNP which appears
to be hard even when $k$ is a constant fraction of $|p|$.

2 Approximate Modular Inversion Problems 

We introduce several variants of the basic MIHNP and study their properties. 
The first variant of MIHNP, which we call the Computational-MIHNP, is useful 
for constructing a MAC. 

Computational-MIHNP: An instance of the C-MIHNP problem is as follows:
let $p$ be a fixed m-bit prime and $k$, $n$ be positive integers. Let $\alpha$ be a random
hidden element of $\mathbb{Z}_p$. We are given $p$, $k$, and $\left(x_i, \mathrm{MSB}_k\!\left(\frac{1}{\alpha + x_i}\right)\right)$ for random values
$x_1, \ldots, x_n$. The problem is to construct another pair $\left(x, \mathrm{MSB}_k\!\left(\frac{1}{\alpha + x}\right)\right)$ for some
$x \ne x_i$. The $\delta$-CMIHNP assumption states that there is no polynomial time
algorithm for this problem whenever $k < \delta m$.

Although we cannot prove the equivalence of this problem to the basic MIHNP,
we do not know of an algorithm for solving it without first discovering the
secret $\alpha$ from the given input. The second variant, which we call the
Decisional-MIHNP, is useful for constructing PRNGs.

Decisional-MIHNP: An instance of the D-MIHNP problem is as follows: let $p$
be a fixed m-bit prime and $k$, $n$ be positive integers. Let $\alpha$ be a random hidden
element of $\mathbb{Z}_p$. We are given $p$ and $k$. The problem is to distinguish the following
two ensembles:

$$\left\langle x_1, \mathrm{MSB}_k\!\left(\tfrac{1}{\alpha + x_1}\right), \ldots, x_n, \mathrm{MSB}_k\!\left(\tfrac{1}{\alpha + x_n}\right) \right\rangle \quad \text{and} \quad
\left\langle x_1, \mathrm{MSB}_k(r_1), \ldots, x_n, \mathrm{MSB}_k(r_n) \right\rangle$$

where $\alpha, x_1, \ldots, x_n, r_1, \ldots, r_n$ are chosen uniformly at random in $\mathbb{Z}_p$. The
$\delta$-DMIHNP assumption states that no polynomial time algorithm can distinguish
these two ensembles with non-negligible advantage whenever $k < \delta m$.

As before, we cannot reduce this problem to either of the previous problems, 
but we know of no algorithms for D-MIHNP, other than first finding the hidden
element $\alpha$. In a sense, it seems that the tools that we have for designing
algorithms for these problems are too crude to distinguish between these variants.

This situation is somewhat analogous to the situation with the various
discrete-logarithm assumptions. The basic MIHNP can be viewed as an analog
of the Discrete-Log Problem (DLP): given $g^{\alpha} \bmod p$ find the hidden number
$\alpha$. Just as DLP is often insufficient for cryptographic constructions, we need
stronger assumptions than the basic MIHNP for the constructions in this paper.
The C-MIHNP can be viewed as an analog of the Computational Diffie-Hellman 
assumption (CDH), and D-MIHNP is the analog of the Decision Diffie-Hellman 
assumption (DDH). As is the case with the various MIHNP problems, we also 
do not have reductions between the various discrete-log problems, yet the only 
algorithms that we know for solving any of them involve solving discrete-log. 



2.1 Random Self Reduction for MIHNP 

The MIHNP problem has a simple limited random self reduction among instances
modulo the same prime $p$. The reduction shows that for a prime $p$ if finding
$\alpha \in \mathbb{Z}_p$ is hard for a worst case $\alpha$ then it is also hard for a random $\alpha \in \mathbb{Z}_p$.

Suppose there is an algorithm A that solves the Basic-MIHNP problem with
probability $\epsilon$, where the probability is taken over the choice of the $x_i$’s and
also over the choice of $\alpha$. We show that this implies an algorithm B for solving
Basic-MIHNP that works for any fixed $\alpha$ with probability $\epsilon$, where this time the
probability is over the choice of the $x_i$’s only.

Given an instance of Basic-MIHNP, $(x_i, y_i)$, $i = 1, \ldots, n$, algorithm B picks
a random $r \in \mathbb{Z}_p$, and runs algorithm A on the Basic-MIHNP problem defined
by the tuples $(x_i + r, y_i)$, $i = 1, \ldots, n$.

Note that if the original MIHNP instance corresponds to the hidden number
$\alpha$, then the new instance will correspond to the hidden number $\alpha' = \alpha - r$, which
is random and independent of the $x_i$’s. It follows that with probability $\epsilon$, the
algorithm A indeed returns $\alpha'$, and then B can add back $r$ to recover $\alpha$.
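
The reduction is short enough to write out. In the Python sketch below, which is an illustration and not code from the paper, the solver `solver_A` is a hypothetical stand-in for algorithm A: it simply tries every candidate in $\mathbb{Z}_p$, which is only feasible because the prime is tiny. The wrapper `reduction_B` follows the steps above: shift every $x_i$ by a random $r$, call A, and add $r$ back. It requires Python 3.8 or later.

```python
import random

def inverse_mod(y, p):
    """Inverse in Z_p, using the convention that the inverse of 0 is 0."""
    y %= p
    return pow(y, -1, p) if y else 0

def msb(y, p, k):
    """One valid MSB_k(y mod p): an integer Y with |Y - y| < p / 2^k."""
    step = max(1, p >> k)
    y %= p
    return y - (y % step)

def consistent(a, pairs, p, k):
    """Does candidate a explain every pair (x, Y) of the instance?"""
    return all(abs(Y - inverse_mod(a + x, p)) * 2 ** k < p for x, Y in pairs)

def solver_A(pairs, p, k):
    """Stand-in for A: exhaustive search over Z_p (tiny p only)."""
    for a in range(p):
        if consistent(a, pairs, p, k):
            return a
    return None

def reduction_B(pairs, p, k, rng):
    """Randomize the hidden number: alpha' = alpha - r, then undo the shift."""
    r = rng.randrange(p)
    shifted = [((x + r) % p, Y) for x, Y in pairs]
    a_prime = solver_A(shifted, p, k)
    return None if a_prime is None else (a_prime + r) % p

rng = random.Random(11)
p, k, n, alpha = 1009, 5, 12, 123
xs = [rng.randrange(p) for _ in range(n)]
pairs = [(x, msb(inverse_mod(alpha + x, p), p, k)) for x in xs]
recovered = reduction_B(pairs, p, k, rng)
print(recovered, consistent(recovered, pairs, p, k))
```

With enough samples the consistent candidate is unique with high probability, so the recovered value is expected to equal the hidden $\alpha$; the sketch only relies on it being consistent with the original instance.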

We call this a limited random self reduction, since we only randomize the
solution $\alpha$, and not the elements $x_1, \ldots, x_n$. The computational MIHNP and
decisional MIHNP have similar limited random self reductions.

3 Security Analysis of the MIHNP

In this section we analyze the security of MIHNP. We show how to apply the
currently known technology in algebraic cryptanalysis to MIHNP, and demonstrate
the limitations of that technology when applied to this problem. We know
of no better way to distinguish the pairs $(x_i, \mathrm{MSB}_k((\alpha + x_i)^{-1}))$ from random,
other than to actually recover the secret $\alpha$ (and use the knowledge of this to
verify the bits), and so this is the problem we address. That is, we assume that
we have a system of equations

$$(\alpha + x_i)(b_i + \epsilon_i) \equiv 1 \pmod{p}, \quad i = 0, \ldots, n, \qquad (1)$$

where $\alpha \in \mathbb{Z}_p$ is the (large, secret) variable we aim to discover, the $x_i$’s are known,
but randomly chosen elements of $\mathbb{Z}_p$, the $b_i$’s are the known most significant
bits, and the $\epsilon_i$ are variables that correspond to the unknown low order bits,
so we have $|\epsilon_i| < 2^{m-k}$ for all $i$. Observe that once we find any of the $\epsilon_i$’s we
can discover the secret $\alpha$ immediately, from the fact that $\alpha = 1/(b_i + \epsilon_i) - x_i$
$\pmod{p}$. However, as we shall see, typically we find all the $\epsilon_i$ simultaneously, or
none at all.

We attempt to solve MIHNP using lattice techniques. We set up a lattice
that incorporates the relations from Eq. (1), so that the bound on the size of
the $\epsilon_i$’s will correspond to some small vector in the lattice. If we can make the
argument that this vector is by far smaller than any other vector in this lattice,
then we could use the LLL lattice reduction algorithm [14] to find it, thereby
recovering the $\epsilon_i$’s. This framework was used in [4] to solve the original HNP.

Looking at Eq. (1), however, we find that these relations cannot be used
directly to set up a lattice. The reason is that each of these relations has a
term of the form $\alpha \cdot \epsilon_i$, where $\alpha$ is unbounded (i.e., it can be as large as $p$),
and the $\epsilon_i$’s change from one relation to the next. To use them in a lattice, one must
first “linearize” these relations, and doing so would introduce a new unbounded
variable for each of the products $\alpha \epsilon_i$. (We stress that current technology has no
problem handling either changing small unknowns such as the $\epsilon_i$, or fixed large
unknowns such as $\alpha$. It is the product of the two that makes this problem hard.)

We are therefore forced to eliminate the unknown $\alpha$ from the relations of
Eq. (1), before we can use them to set up a lattice. Given the $n + 1$ relations
from Eq. (1), we eliminate the unknown $\alpha$, and produce $n$ relations of the form:

$$(x_i - x_0)(b_0 + \epsilon_0)(b_i + \epsilon_i) - (b_0 + \epsilon_0) + (b_i + \epsilon_i) \equiv 0 \pmod{p} \qquad (2)$$

These relations are already in a form that is amenable for use in a lattice,
and we can apply to them (an extension of) the techniques from [6,12], as we
now explain. We start by re-writing the left hand side of Eq. (2) as a polynomial
in the unknowns $\epsilon_0$ and $\epsilon_i$, namely:

$$f_i(\epsilon_0, \epsilon_i) \stackrel{\mathrm{def}}{=} (x_i - x_0)\epsilon_0\epsilon_i + (b_0(x_i - x_0) + 1)\epsilon_i + (b_i(x_i - x_0) - 1)\epsilon_0 + (b_0 b_i (x_i - x_0) + b_i - b_0)$$
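
The elimination step can be verified numerically. The Python sketch below is illustrative and not from the paper: it picks a hidden value, splits each inverse $(\alpha + x_i)^{-1} \bmod p$ into a known top part $b_i$ and an unknown low part $e_i$ with $0 \le e_i < 2^{m-k}$ (one simple way to realize the approximation, assumed here), and checks that the polynomial obtained by expanding the left hand side of Eq. (2) vanishes modulo $p$ at $(e_0, e_i)$. The helper names are hypothetical. It requires Python 3.8 or later.

```python
import random

def split_msb(y, m, k):
    """Split y into a known top part b and a low part e, 0 <= e < 2^(m-k)."""
    e = y % (1 << (m - k))
    return y - e, e

def poly_coefficients(x0, xi, b0, bi):
    """Coefficients (A, B, C, D) of A*e0*ei + B*ei + C*e0 + D, obtained by
    expanding (xi - x0)(b0 + e0)(bi + ei) - (b0 + e0) + (bi + ei)."""
    dx = xi - x0
    return dx, b0 * dx + 1, bi * dx - 1, b0 * bi * dx + bi - b0

rng = random.Random(3)
p, m, k, n = 2 ** 31 - 1, 31, 12, 4
alpha = rng.randrange(1, p)
xs = []
while len(xs) < n + 1:
    x = rng.randrange(p)
    if (alpha + x) % p and x not in xs:     # avoid the special case alpha + x = 0
        xs.append(x)

parts = [split_msb(pow(alpha + x, -1, p), m, k) for x in xs]
(b0, e0) = parts[0]
residues = []
for i in range(1, n + 1):
    bi, ei = parts[i]
    A, B, C, D = poly_coefficients(xs[0], xs[i], b0, bi)
    residues.append((A * e0 * ei + B * ei + C * e0 + D) % p)
print(residues)
```

Every residue is zero, which confirms that $\alpha$ has been eliminated and that the small unknowns $(e_0, e_i)$ form a root of each polynomial modulo $p$.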

Notice that the coefficients of this polynomial are known to us (since we know
all the $b_i$’s and $x_i$’s), and therefore we can set up a lattice based on their values.
To simplify notation, we denote below $f_i(\epsilon_0, \epsilon_i) = A_i \epsilon_0 \epsilon_i + B_i \epsilon_i + C_i \epsilon_0 + D_i$.

3.1 First Attempt: A Linear Approach

As a first attempt at a solution, we set up a lattice of dimension $3n + 2$ as follows.
The lattice is spanned by the rows of a real matrix $M$ that has the following
general structure:

$$M = \begin{pmatrix} E & R \\ 0 & P \end{pmatrix}$$

where $E$ and $P$ are diagonal matrices of dimensions $(2n + 2) \times (2n + 2)$ and
$n \times n$, respectively, and $R$ is a $(2n + 2) \times n$ matrix. Each of the first $2n + 2$
rows of $M$ is associated with one of the terms in relations from Eq. (2) (i.e., the
constant term, the terms $\epsilon_i$, and the terms $\epsilon_0\epsilon_i$), and each of the last $n$ columns
is associated with one of the $n$ relations.

The matrix $R$ incorporates the relations themselves. The $(i, j)$ entry in this
matrix is just the coefficient in the $j$’th relation of the term corresponding to
row $i$. The diagonal entries of the matrix $P$ are all equal to $p$, and the diagonal
entries of the matrix $E$ correspond to the bounds on the terms associated with
each row. Specifically, if the term which is associated with row $i$ is bounded
by $B$, then entry $(i, i)$ in $E$ is equal to $1/B$. That is, the row corresponding to
the constant term has diagonal entry 1, rows corresponding to $\epsilon_i$ have diagonal
entries $1/2^{m-k}$, and rows corresponding to $\epsilon_0\epsilon_i$ have diagonal entries $1/2^{2(m-k)}$.
An example for the matrix $M$ for $n = 2$ is given in Figure 1.

Fig. 1. The matrix M for the case n = 2.

We can now view each one of the relations of Eq. (2) as holding over the
integers, by explicitly introducing the appropriate multiple of $p$. Namely, we
have:

$$A_i \epsilon_0 \epsilon_i + B_i \epsilon_i + C_i \epsilon_0 + D_i + p \cdot \kappa_i = 0 \qquad (3)$$

From the way we constructed this system of $n$ polynomial relations, we know
that it has an integer solution $\epsilon_i = e_i$, $\kappa_i = k_i$ in which all the $e_i$’s are bounded
below $2^{m-k}$. Let $v$ be a $(3n + 2)$ integer vector containing the values of all the
terms in our system of equations, according to this solution. Namely, we set

$$v = (1, e_0, \ldots, e_n, e_0 e_1, \ldots, e_0 e_n, k_1, \ldots, k_n).$$

It follows that for this integer vector $v$ we get:

$$v \cdot M = \left(1, \frac{e_0}{2^{m-k}}, \ldots, \frac{e_n}{2^{m-k}}, \frac{e_0 e_1}{2^{2(m-k)}}, \ldots, \frac{e_0 e_n}{2^{2(m-k)}}, 0, \ldots, 0\right).$$

Thus, the lattice point $v \cdot M$ has only $2n + 2$ non-zero entries, and each of these
is at most 1 in absolute value, so its Euclidean norm is less than $\sqrt{2n + 2}$.

On the other hand, it is easy to see that the determinant of the lattice
$L(M)$ equals $p^n / 2^{(m-k)(3n+1)}$. Making use of the Gaussian heuristic^1 for short
lattice vectors, we expect that our vector is the shortest point in $L(M)$ as long
as

$$\sqrt{2n+2} < \sqrt{3n+2} \cdot \left( \frac{p^n}{2^{(m-k)(3n+1)}} \right)^{1/(3n+2)} \qquad (4)$$

Whenever this condition is met we will assume that an adversary can recover the
vector $v$ using lattice reduction methods such as LLL (although we note that in
practice, the adversary may not find this that easy unless $v \cdot M$ is the shortest
vector by a substantial margin).

Substituting $p \approx 2^m$ into Eq. (4) and ignoring low-order terms, this condition
is simplified to $2^k > 2^{2m/3}$. Therefore, this method can only be used when the
number of bits of $1/(\alpha + x_i)$ that we see is more than $2m/3$ (alternatively, when
the number of bits that we are missing is less than $m/3$). This gives an algorithm
for Basic-MIHNP when $\delta > 2/3$.
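The threshold can be checked numerically. The sketch below evaluates the heuristic condition in logarithmic form, assuming $p = 2^m$ exactly and the determinant $p^n / 2^{(m-k)(3n+1)}$; the function names are illustrative and not part of the paper.

```python
import math

def log2_heuristic_bound(n, m, k):
    """log2 of sqrt(3n+2) * det^(1/(3n+2)), with p approximated by 2^m."""
    dim = 3 * n + 2
    log2_det = n * m - (m - k) * (3 * n + 1)
    return 0.5 * math.log2(dim) + log2_det / dim

def condition_holds(n, m, k):
    """True when sqrt(2n+2) is below the heuristic bound."""
    return 0.5 * math.log2(2 * n + 2) < log2_heuristic_bound(n, m, k)

def min_bits_needed(n, m):
    """Smallest k for which the heuristic condition holds, or None."""
    for k in range(1, m + 1):
        if condition_holds(n, m, k):
            return k
    return None
```

For a large number of relations the smallest admissible k settles near 2m/3, in line with the simplified condition above.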



The dimension of the lattice. We remark that the same bounds (but no better) 
could also be achieved from a lattice of smaller dimension that utilizes the fact 
that $v \cdot M$ has $n$ trailing zeros. However in this analysis we are only interested
in showing that there exist bounds on m even for lattices of arbitrary dimension 
(i.e. we effectively allow the adversary the power to reduce lattices of arbitrary 
dimension). This means we can ignore efficiency issues regarding the dimension 
of the lattice, and opt for the easier way to describe and extend the lattices 
(as above). This assumption is particularly important to note in the subsequent 
section where the dimension grows exponentially in n. 



3.2 Making Use of Multiples 

To improve upon the bound of $m/3$, we apply a technique due to Coppersmith
[6] to make better use of the relations in Eq. (2). Namely, instead of using only
these relations in our lattice, we can use also relations that are derived by taking
products of them. For example, since we have $f_1(\epsilon_0, \epsilon_1) = 0$ and $f_2(\epsilon_0, \epsilon_2) = 0$, we
also know that $\epsilon_2 f_1(\epsilon_0, \epsilon_1) = 0$, and also $f_1(\epsilon_0, \epsilon_1) \cdot f_2(\epsilon_0, \epsilon_2) = 0$. Moreover, since
the original relations hold modulo $p$, the last relation holds also modulo $p^2$.

^1 In fact, it is possible to prove rigorously that, when the $x_i$'s are chosen at random,
the vector $v \cdot M$ is (with high probability) the shortest vector in $L(M)$. This proof
will appear in the full version of this paper.

Of course, these additional relations introduce new terms that were not
present in the original ones. (For example, the relation $f_1 f_2 = 0$ from above
has a term $\epsilon_0^2\epsilon_1\epsilon_2$, which we did not have in the original system.) Nonetheless,
we hope that weighing the additional relations against the additional terms, we
would be able to get a better result. Hence, our goal here is to add as many
relations as possible, while keeping the number of additional terms as small as
possible.

Once we decide on a set of relations to use, we construct the lattice in exactly
the same way as above. Namely, if we have $r$ relations and $t$ terms, we construct
an $(r+t) \times (r+t)$ matrix $M$ with the same structure as above. That is, the top
left $t \times t$ sub-matrix $E$ is diagonal with entries that correspond to the (bounds
on the) different terms, to its right we put a $t \times r$ matrix $R$ that corresponds to
our relations, and at the bottom right we put a diagonal $r \times r$ matrix $P$ that takes
care of the modular reductions. One difference is that now, if the $i$'th relation
holds modulo $p^j$, then the corresponding diagonal entry of $P$ will be $p^j$ (rather
than just $p$).



Constructing a lattice. The key aspect of this approach is to choose which
relations to put in the lattice, and to analyze the parameters achieved by this
lattice. Below we think of the process of adding relations to the lattice as
happening in phases. In phase $d$, we add to the lattice relations that are obtained
by multiplying up to $d$ of the original relations. These new relations look like
$f_{i_1} \cdots f_{i_d} = 0 \bmod p^d$ for some $0 < i_1, \ldots, i_d \le n$.

We note that once we have in the lattice some relations (and all their terms),
we might as well add other relations that use only terms that already appear in
the lattice. For example, if we have the relation $f_1 f_2 = 0$ in the lattice, we might
as well also add the relation $\epsilon_1 f_2 = 0$, since every term that appears in $\epsilon_1 f_2$ must
already appear in $f_1 f_2$ (because $f_1$ includes the term $\epsilon_1$). Therefore, once we have
$f_1 f_2$ and all its terms, we can add $\epsilon_1 f_2$ “for free”. The only exception is that
we have to make sure that the relations in the lattice are linearly independent.
For example, once we have in the lattice the relations $f_2$, $\epsilon_0 f_2$, $\epsilon_1 f_2$ and $f_1 f_2$,
we cannot add also the relation $\epsilon_0\epsilon_1 f_2$, as it is linearly dependent on the other
relations, by the equality $f_1 f_2 = A_1\epsilon_0\epsilon_1 f_2 + B_1\epsilon_1 f_2 + C_1\epsilon_0 f_2 + D_1 f_2$.
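The dependency and the "for free" observation can both be checked mechanically. The sketch below represents polynomials in $(\epsilon_0, \epsilon_1, \epsilon_2)$ as dictionaries from exponent tuples to coefficients; the representation, the helper names, and the coefficient values are illustrative assumptions.

```python
from collections import defaultdict

def mul(f, g):
    """Multiply polynomials stored as {(deg_e0, deg_e1, deg_e2): coeff}."""
    out = defaultdict(int)
    for ea, ca in f.items():
        for eb, cb in g.items():
            out[tuple(a + b for a, b in zip(ea, eb))] += ca * cb
    return {e: c for e, c in out.items() if c != 0}

def add(*polys):
    """Sum of several polynomials."""
    out = defaultdict(int)
    for f in polys:
        for e, c in f.items():
            out[e] += c
    return {e: c for e, c in out.items() if c != 0}

def scaled(c, f):
    """Multiply a polynomial by the constant c."""
    return {e: c * v for e, v in f.items()}

def relation(i, A, B, C, D):
    """f_i = A*e0*ei + B*ei + C*e0 + D for i in {1, 2}."""
    ei = (0, 1, 0) if i == 1 else (0, 0, 1)
    e0ei = (1,) + ei[1:]
    return {e0ei: A, ei: B, (1, 0, 0): C, (0, 0, 0): D}

A1, B1, C1, D1 = 2, 3, 5, 7            # illustrative coefficients only
f1 = relation(1, A1, B1, C1, D1)
f2 = relation(2, 11, 13, 17, 19)
e0, e1 = {(1, 0, 0): 1}, {(0, 1, 0): 1}

lhs = mul(f1, f2)
rhs = add(scaled(A1, mul(mul(e0, e1), f2)), scaled(B1, mul(e1, f2)),
          scaled(C1, mul(e0, f2)), scaled(D1, f2))
```

Here `lhs == rhs` expresses the linear dependency, and the monomials of `mul(e1, f2)` form a subset of those of `lhs`, which is why that relation adds no new terms.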

Notations and conventions. In the analysis below we talk about the “weight” of
relations or terms. The weight of a relation is the number of original relations
that are multiplied. For example, the relation $\epsilon_1 f_2 = 0$ has weight 1, and $f_1 f_2 = 0$
has weight 2. We note that if a relation has weight $i$, then this relation holds
modulo $p^i$ (but not necessarily modulo $p^{i+1}$). The weight of a term is just its
degree. For example, the weight of $\epsilon_0^2\epsilon_1$ is 3. With this notation, the determinant
of the lattice grows with the total weight of all the relations, and shrinks
with the total weight of all the terms that are used in these relations.
More precisely, when $p$ is an $m$-bit prime, and the bound on the $\epsilon_i$'s is $2^{m-k}$
(i.e., the number of bits of $1/(\alpha + x)$ that we see is $k$), the determinant of the
lattice is roughly $2^{m \cdot \mathrm{weight(relations)} - (m-k) \cdot \mathrm{weight(terms)}}$.

Recall that our goal is to maximize the determinant (since this would
imply that the lattice is unlikely to have short vectors, other than the one
corresponding to our solution). Below we show, however, that we always have
$\mathrm{weight(relations)} < 2 \cdot \mathrm{weight(terms)}/3$. Therefore, to get $\det(L) > 1$, we must
have $m - k < 2m/3$.

The relations. In this analysis we assume that the number $n$ of the original
relations can be made as large as we want. Since we aim to show that the
approach is bound to fail beyond $2m/3$, we can make this assumption without
loss of generality (as adding relations can only help the algorithm). When we
analyze phase $d$, we assume that $n \gg d$, so that we get a good approximation of
the sum $\sum_{j=0}^{d} \binom{n}{j}$ by taking just the last term, $\binom{n}{d}$.

The relations that we add in phase $d$ are all the $\binom{n}{d}$ relations of weight $d$,
that are obtained by multiplying $d$ distinct relations (from the original $n$), and
then adding all the relations that are now “for free”. We again note that since
$n \gg d$, a vast majority of the relations in phase $d$ are of this form.

Analysis. We start by analyzing the weight of the terms. Since each $f_i$ has all
the possible terms for a multi-linear function in $\epsilon_0, \epsilon_i$, it follows that a product
of $d$ distinct $f_i$'s has all the possible terms with degree at most $d$ in $\epsilon_0$, and at
most 1 in all the other $\epsilon_i$'s.

We group the terms according to the number of $\epsilon_i$'s other than $\epsilon_0$ in them.
Clearly, we have exactly $(d+1)\binom{n}{j}$ terms with exactly $j$ $\epsilon_i$'s other than $\epsilon_0$. (We
have $\binom{n}{j}$ ways to choose the $\epsilon_i$'s, and then $\epsilon_0$ can have any degree between 0 and
$d$.) The weight of these terms ranges from $j$ (if the degree of $\epsilon_0$ is 0) to $j + d$ (if
the degree of $\epsilon_0$ is $d$). Therefore, the total weight of all the terms is

$$\mathrm{weight(terms)} = \sum_{j=0}^{d} \binom{n}{j} \cdot \big( j + (j+1) + \cdots + (j+d) \big)$$

Recall now that we assume that $n$ is large enough with respect to $d$, so that
$\sum_{j=0}^{d} \binom{n}{j} = \binom{n}{d}(1 + o(1))$. This implies that also

$$\mathrm{weight(terms)} = \binom{n}{d}(d+1)(3d/2)(1 + o(1))$$

By the same argument, the number of terms is $(d+1)\binom{n}{d}(1 + o(1))$.
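The step from the sum to the closed form can be spelled out as follows. This is a worked derivation added for clarity; it keeps only the dominant $j = d$ group, as the text assumes when $n \gg d$.

```latex
\begin{align*}
\sum_{i=0}^{d} (j+i) &= (d+1)\,j + \frac{d(d+1)}{2}, \\
\text{so for } j = d:\quad
\sum_{i=0}^{d} (d+i) &= (d+1)\,d + \frac{d(d+1)}{2} = (d+1)\cdot\frac{3d}{2}, \\
\mathrm{weight(terms)} &= \binom{n}{d}\,(d+1)\,\frac{3d}{2}\,\bigl(1+o(1)\bigr).
\end{align*}
```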

We now proceed to analyze the weight of the relations. First, observe that
we cannot have more relations than terms in the lattice, since otherwise we
get linear dependencies. Thus, there are at most $(d+1)\binom{n}{d}(1 + o(1))$ relations.
Moreover, the weight of each of these relations cannot be more than $d$, since in
phase $d$ we only multiply up to $d$ of the $f_i$'s. Therefore, the total weight of all
the relations is bounded by

$$\mathrm{weight(relations)} \le d \cdot (d+1)\binom{n}{d}(1 + o(1))$$

In fact, it is possible to show that this bound is tight, and the total weight of
the relations that we get is at least $d^2\binom{n}{d}$. We conclude that in our lattice we
must have

$$\frac{\mathrm{weight(relations)}}{\mathrm{weight(terms)}} \le \frac{d \cdot (d+1)\binom{n}{d}(1 + o(1))}{\binom{n}{d}(d+1)(3d/2)(1 + o(1))} = \frac{2}{3} + o(1)$$

(We remark that a more careful analysis can even show a bound of $2/3 - o(1)$.)
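The limiting ratio can also be observed numerically. The sketch below uses the crude bound from the text (no more relations than terms, each of weight at most d) together with the exact term weights; the function name and the choice of test values are assumptions made for illustration.

```python
from fractions import Fraction
from math import comb

def weight_ratio_bound(n, d):
    """Upper bound on weight(relations) / weight(terms) in phase d."""
    num_terms = sum((d + 1) * comb(n, j) for j in range(d + 1))
    weight_terms = sum(
        comb(n, j) * sum(j + i for i in range(d + 1)) for j in range(d + 1)
    )
    weight_relations_max = d * num_terms   # at most one relation per term
    return Fraction(weight_relations_max, weight_terms)
```

For fixed d the value stays slightly above 2/3 and approaches it as n grows, matching the $2/3 + o(1)$ bound.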



3.3 Conclusions from the Analysis of MIHNP 

We showed that the Basic-MIHNP problem can be efficiently solved whenever
we are given more than 1/3 of the bits of $(\alpha + x_i)^{-1} \bmod p$. The analysis does not
extend beyond 1/3. (Moreover, near 1/3 the dimension of the lattice makes it
completely infeasible to reduce.) For this reason, we conjecture that this problem
is hard when we are given less than 1/3 of the bits even if a large number of
random samples $x_i$ are given. That is, we conjecture that the $\delta$-MIHNP assumption
holds whenever $\delta < 1/3$.



3.4 Other Variants of MIHNP 

The tools that we devised to analyze the MIHNP can be used also to analyze
similar problems. For example, in Section 4 we will be interested in a problem
where we are given pairs $(x_i, \beta/(\alpha + x_i) \bmod p)$, $i = 1, \ldots, n$, and we need to
recover both $\alpha$ and $\beta$. The corresponding relations that we get are

$$(\alpha + x_i)(b_i + \epsilon_i) = \beta \pmod{p}, \qquad i = 0, \ldots, n.$$

Again, we have a problem with the terms $\alpha\epsilon_i$, but when we eliminate $\alpha$ as before,
we would get terms $\beta\epsilon_i$. Hence, to be able to set up a lattice we need to eliminate
both $\alpha$ and $\beta$. More generally we may consider relations of the form

$$R_i : \ (x_{i0} + y_{i0}\epsilon_i) + \sum_{j=1}^{r} (x_{ij} + y_{ij}\epsilon_i)\,\alpha_j = 0$$

where the $x_{ij}$'s and $y_{ij}$'s are random and known, the $\alpha_j$'s are unknown and
unbounded, but common to all these relations, and each $\epsilon_i$ is an unknown unique
to relation $i$, but for which we have some bound. As before, the terms $\epsilon_i\alpha_j$
cannot be handled by standard lattice reduction techniques, so we need to first
eliminate the $\alpha_j$'s. We now show that if we need to eliminate $r$ such “unbounded
variables”, then the lattice-reduction techniques from above can only be used
when $k/m > r/(r+2)$ (i.e., the number of hidden bits is less than $2m/(r+2)$). Note
that for MIHNP we have $r = 1$, and indeed we got the bound $k/m > 1/3$. For
the case above of two variables, we get $k/m > 1/2$. Hence, this problem is harder
than MIHNP in the sense that it can only be solved when $\delta = k/m > 1/2$.

Assume that we are given $n + r$ relations. We can set up a linear system in
the $r$ unknowns $\alpha_1, \ldots, \alpha_r$ and $r$ relations $R_{n+1}, \ldots, R_{n+r}$, solve for the unknowns,
and then substitute the solution in all the other $n$ relations $R_1, \ldots, R_n$ (each time
multiplying by the common denominator, to get a polynomial relation rather
than a rational one).^2

Using Cramer’s rule for the solution of a linear system, it is easy to verify
that the terms that we substitute for the $\alpha_j$'s are multi-linear in $\epsilon_{n+1}, \ldots, \epsilon_{n+r}$.
Hence, after eliminating the “unbounded variables”, we are left with $n$ relations
$f_i = 0$, $i = 1, \ldots, n$, where $f_i$ is a multi-linear relation in $\epsilon_i, \epsilon_{n+1}, \ldots, \epsilon_{n+r}$. These
relations are the ones we use to set up a lattice.

As we did for MIHNP, we set up a lattice not only using the $f_i$'s themselves,
but also using products of them. As before, we use relations that we obtain by
multiplying $d$ distinct $f_i$'s (for some parameter $d$, and under the assumption that
$n \gg d$). A product of $d$ such $f_i$'s is a relation

$$\rho(\epsilon_{i_1}, \epsilon_{i_2}, \ldots, \epsilon_{i_d}, \epsilon_{n+1}, \ldots, \epsilon_{n+r}) = 0$$

where $\rho$ is multi-linear in the $\epsilon_{i_j}$'s, and has degree $d$ in $\epsilon_{n+1}, \ldots, \epsilon_{n+r}$. We want
to count the total weight of the terms and relations in this lattice. As we know,
if $n \gg d$, then it is sufficient to consider only those terms that include exactly
$d$ distinct $\epsilon_i$'s, other than $\epsilon_{n+1}, \ldots, \epsilon_{n+r}$. So there are $\binom{n}{d}$ ways of choosing the
$\epsilon_{i_j}$'s, and for each choice we have $(d+1)^r$ possible combinations of the degrees
of $\epsilon_{n+1}, \ldots, \epsilon_{n+r}$. Namely, for a specific choice of $\epsilon_{i_1}, \epsilon_{i_2}, \ldots, \epsilon_{i_d}$, the terms that we
get are exactly all the terms in the expression

$$(\epsilon_{i_1} \cdot \epsilon_{i_2} \cdots \epsilon_{i_d}) \cdot \big( (1 + \epsilon_{n+1} + \cdots + \epsilon_{n+1}^d) \cdots (1 + \epsilon_{n+r} + \cdots + \epsilon_{n+r}^d) \big)$$

This means that for this choice of $\epsilon_{i_j}$'s, we have $(d+1)^r$ terms, and the weights
of these terms vary between $d$ and $d + rd$. The total weight of all these terms is

$$\sum_{k_1=0}^{d} \sum_{k_2=0}^{d} \cdots \sum_{k_r=0}^{d} (d + k_1 + k_2 + \cdots + k_r) = (d+1)^r \cdot (d + rd/2)$$

Therefore, we have $\binom{n}{d}(d+1)^r$ terms, of total weight $\binom{n}{d}(d+1)^r \cdot d(1 + r/2)$.
On the other hand, we cannot have more relations than terms, and the weight
of a relation cannot be more than $d$, so the total weight of the relations is
at most $\binom{n}{d}(d+1)^r \cdot d$. (This bound is tight, since it can be shown that for
random relations, the total weight is at least $\binom{n}{d}(d+1)^r \cdot (d - r)$.) Recall that
the determinant of our lattice is roughly $2^{m \cdot \mathrm{weight(relations)} - (m-k) \cdot \mathrm{weight(terms)}}$. To
get the determinant above 1, we therefore must have

$$m \cdot \binom{n}{d}(d+1)^r d > (m-k) \cdot \binom{n}{d}(d+1)^r d\,(1 + r/2)$$

which means that $m > (m-k)(1 + r/2)$, or $k/m > r/(r+2)$.

^2 Clearly, this is not the only way to eliminate the unbounded variables. For example,
we can solve different sets of relations for these unknowns, depending on the relation
into which we want to substitute. However, tracing through the arguments below, the
method we use here seems to give the smallest number of terms.
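The multi-sum and the resulting threshold are easy to sanity-check by brute force. The sketch below enumerates all degree combinations for small d and r; the function names are illustrative.

```python
from fractions import Fraction
from itertools import product

def total_term_weight(d, r):
    """Sum of d + k_1 + ... + k_r over all 0 <= k_i <= d (brute force)."""
    return sum(d + sum(ks) for ks in product(range(d + 1), repeat=r))

def closed_form(d, r):
    """(d+1)^r * (d + r*d/2), computed in integers."""
    return (d + 1) ** r * d * (2 + r) // 2

def visible_fraction_threshold(r):
    """Fraction k/m of visible bits above which the lattice approach applies."""
    return Fraction(r, r + 2)
```

With r = 1 the threshold is 1/3 (the MIHNP case) and with r = 2 it is 1/2, as stated at the start of this section.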




4 Cryptographic Applications 

The apparent intractability of MIHNP suggests that it may be useful as the basis
for cryptographic applications. Indeed, we show below how to use the
decision-MIHNP assumption and the computational-MIHNP assumption, respectively,
to get an efficient pseudorandom generator and a MAC.

4.1 Pseudorandom Generator 

The decision-MIHNP immediately suggests a construction of a PRNG. The input
to this generator would be “the secret” $\alpha$, and $n$ random points $x_1, \ldots, x_n \in \mathbb{Z}_p$.
The output would be the points $x_1, \ldots, x_n$ together with (say) 1/4 of the bits of
$1/(\alpha + x_i) \bmod p$ for all $i$. More precisely, we have the following system:

Parameters. The parameters of the system include an $m$-bit prime $p$, and two
other parameters, $n$ and $k$, where $k$ specifies how many bits of $1/(\alpha + x)$ we
output, and $n$ specifies how many $x_i$'s we have in the input of the generator.
These parameters are discussed in more detail below.

The generator. On parameters $p$, $n$ and $k$, the generator input is a sequence
$(x_1, \ldots, x_n, \alpha)$ of $n + 1$ elements in $\mathbb{Z}_p$. The output is the sequence

$$G(\alpha, x_1, \ldots, x_n) = \left( x_1, \ldots, x_n, \mathrm{MSB}_k\!\left(\frac{1}{\alpha + x_1}\right), \ldots, \mathrm{MSB}_k\!\left(\frac{1}{\alpha + x_n}\right) \right)$$

The security of this generator follows immediately from the decision-MIHNP
assumption. We note that this is a pseudorandom number generator, but not a
pseudorandom bit generator, since the output distribution is not the uniform
one. There are standard techniques for transforming this to a pseudorandom
bit generator. Any of a number of standard extractors could be used for this
purpose [16,11].

Proposition 1. Under the D-MIHNP assumption, G is a secure pseudorandom 
generator. 
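A direct, unoptimized rendering of the generator might look as follows. This is a sketch under stated assumptions: `msb` interprets $\mathrm{MSB}_k$ as the top k bits of the m-bit representation (a simplification chosen here), the function names are hypothetical, and the toy modulus in any usage is for illustration only. It relies on Python 3.8+ for modular inversion via `pow`.

```python
def msb(z, m, k):
    """Top k bits of z viewed as an m-bit integer (assumed reading of MSB_k)."""
    return z >> (m - k)

def generate(alpha, xs, p, k):
    """Return (x_1..x_n, MSB_k(1/(alpha + x_i) mod p) for each i)."""
    m = p.bit_length()
    outputs = []
    for x in xs:
        s = (alpha + x) % p
        if s == 0:
            raise ValueError("alpha + x is not invertible modulo p")
        outputs.append(msb(pow(s, -1, p), m, k))   # one inversion per point
    return list(xs), outputs
```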

One point worth mentioning is the re-keying of the generator from the
previous output. It is well known, see [3], that it is secure to do this, if the underlying
generator is itself secure. In our case this means that we may fix the $x_1, \ldots, x_n$
once at the start of the whole procedure, and then use just the $\mathrm{MSB}_k(1/(\alpha + x_i))$
part of the output to re-key $\alpha$ and form the output bits of the PRNG.
Parameters and performance. The parameters $m$ (the size of the prime $p$)
and $k$ (the number of bits to output from each $1/(\alpha + x_i)$) must be chosen such
that solving the MIHNP with $k$ output bits modulo a prime of size $|p| = m$ is
infeasible. More precisely, if we assume that the threshold for feasible solution is
when the adversary sees more than $m/3$ of the bits of $1/(\alpha + x_i)$, and we want a security
level of $2^r$, we need to make sure that our generator outputs at most $m/3 - r$
of the bits. This means that we have a tradeoff between the number of the bits
that we output (which is related to the expansion of the generator) and the size
of the prime that we work with.

A reasonable setting is to set $r = k$ (i.e., output as many bits of $1/(\alpha + x_i)$
as our security parameter). With this setting, we should choose $m$ so that
$k < m/3 - k$, namely $m > 6k$. (Another constraint is that to get security level $2^r$, we
must hide at least $2r$ bits of $1/(\alpha + x_i)$, to avoid birthday-type attacks. In the
current setting, however, this constraint is subsumed by the previous one.) An
invocation of the generator $G$ stretches a random input of length $(n+1)m$ bits
into a pseudorandom output of length $n(m + k)$ bits. Hence, each invocation
generates $nk - m$ pseudorandom bits.

For a numerical example, assume that we want to get a security level of $2^{80}$.
We then set $m = 6 \cdot 80 = 480$ and $k = 80$ (i.e., we work with a 480-bit prime,
and output 80 of the bits of $1/(\alpha + x_i)$). With these parameters, each invocation
of $G$ generates $80n - 480$ pseudorandom bits (so we must choose $n > 6$ to get
any expansion). In our example below we use $n = 10$.

A naive implementation of this generator would require $n$ modular inversions
to compute $\mathrm{MSB}_k(1/(\alpha + x_i))$, $i = 1, \ldots, n$. Therefore, the cost of this implementation
is roughly $k$ bits per inversion (for a sufficiently large $n$). Keeping with the
numerical example above, choosing, for example, $n = 10$, the size of the seed
(which is the amount of state we keep) is 4800 bits (= 600 bytes), and we get
$nk - m = 320$ pseudorandom bits at the cost of 10 inversions, or 32 bits per
modular inversion. Keeping a larger state results in more bits per inversion. For
example, setting $n = 20$, we have 9600 bits (= 1200 bytes) of state, and we get
1120 bits at the cost of 20 inversions, which is 56 bits per inversion.
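The arithmetic of the example can be reproduced with a few lines. The helper below is a bookkeeping sketch only: it assumes the setting $r = k$ with $m = 6k$ and counts state as $n \cdot m$ bits, which is how the figures quoted above are obtained; the function name is hypothetical.

```python
def generator_budget(n, k):
    """Bit counts for the naive generator with m = 6k (setting r = k)."""
    m = 6 * k
    fresh = n * k - m                  # pseudorandom bits gained per invocation
    return {
        "m": m,
        "state_bits": n * m,           # counted as in the text's example
        "fresh_bits": fresh,
        "bits_per_inversion": fresh / n,
    }
```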

Even this naive implementation is already quite fast. With a careful
implementation, the cost of modular inversion can be as small as only a few
multiplications [1]. Moreover, since we work in a relatively small field, the operations
can be quite fast. Finally, we note that the modular inversions are independent
of each other, so it is trivial to parallelize this computation.

Speedup via batching. One way to speed up the computation is to trade
modular inversions for multiplications by using batching. The idea, first
discovered by Peter Montgomery, is as follows: To compute $1/(\alpha + x_i)$, $i = 1, \ldots, n$, we
first compute the product $\pi = \prod_{i=1}^{n} (\alpha + x_i)$, then invert only this product to get
$\pi^{-1} \bmod p$, and finally compute $1/(\alpha + x_j) = \pi^{-1} \cdot \prod_{i \ne j} (\alpha + x_i)$. It is not hard to
see that one can compute all the values $1/(\alpha + x_i)$ using only $3(n-1)$
multiplications and one modular inversion. (For this, one needs to keep in memory up to $n$
intermediate values during the computation.) If inversion is more expensive than
three multiplications (as is the case for all the multi-precision software libraries
that we know), then this implementation will be more efficient than the naive
one.
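One common way to organize the batching is with prefix products, as sketched below. The function name and the explicit multiplication counter are additions for illustration; the routine assumes every input is non-zero modulo p and uses Python 3.8+ for the single inversion.

```python
def batch_inverse(values, p):
    """Invert all values modulo p with one inversion and 3(n-1) multiplications.

    Returns (inverses, number_of_multiplications).
    """
    n = len(values)
    mults = 0
    prefix = [values[0] % p]           # prefix[i] = values[0] * ... * values[i]
    for v in values[1:]:
        prefix.append(prefix[-1] * v % p)
        mults += 1
    inv = pow(prefix[-1], -1, p)       # the only modular inversion
    out = [0] * n
    for i in range(n - 1, 0, -1):
        out[i] = inv * prefix[i - 1] % p   # 1 / values[i]
        inv = inv * values[i] % p          # now 1 / prefix[i-1]
        mults += 2
    out[0] = inv
    return out, mults
```

The list of prefix products is the "up to n intermediate values" kept in memory during the computation.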

Back to our numeric example, with $n = 10$ we get 320 bits for one inversion
and 28 multiplications, which is about 11 bits per multiplication. With $n = 20$
we get 1120 bits for one inversion and 58 multiplications, which is roughly 19
bits per multiplication. Hence, our generator is more efficient than other
algebraic generators, e.g. the pseudorandom generator due to Gennaro [9] which is
based on the problem of discrete-log with small exponent. The generator of [9]
generates approximately one pseudorandom bit per multiplication. Furthermore,
Gennaro’s generator uses a much larger prime field. Other algebraic generators,
such as the Blum-Blum-Shub generator [2], generate a small number of
pseudorandom bits per multiplication modulo a much larger modulus than the one we
use. The exact comparison of our generator to BBS depends on the number of
bits per round output by the BBS generator.



Even faster variants. We can increase the speed even further by slightly 
modifying the generator itself. Below we describe two such modifications. 

Re-defining the output. To speed up the batching implementation, we change the
output of the generator, so that it would be easier to compute this output from
the intermediate value $\pi^{-1}$ that we get during the computation. Specifically, we
set

$$G'(\alpha, x_1, \ldots, x_n) = \left( x_1, \ldots, x_n, \mathrm{MSB}_k\!\left(\frac{1}{\pi_1}\right), \ldots, \mathrm{MSB}_k\!\left(\frac{1}{\pi_n}\right) \right)$$

where the $\pi_j$'s are defined by $\pi_j = \prod_{i \ne j} (\alpha + x_i)$.

We stress that the security of G' does not seem to be equivalent to our 
original D-MIHNP problem. Rather, this generator defines yet another variant of 
D-MIHNP. Still, the analysis from Section 3 applies in exactly the same manner 
to this variant too. 

Implementing $G'$ using the batching technique takes only $2n - 1$
multiplications and one inversion (and can also be parallelized more easily than with $G$). Hence,
in our numerical example we get 10 bits per multiplication for $n = 8$, or 28 bits
per multiplication for $n = 20$.
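A sketch of this variant is given below. It assumes that $\pi_j$ is the product of $(\alpha + x_i)$ over all $i \ne j$, a reading that reproduces a count of $2n - 1$ multiplications plus one inversion; it also assumes $\mathrm{MSB}_k$ means the top k bits of the m-bit value. Both assumptions and the function name are choices made here.

```python
def generate_fast(alpha, xs, p, k):
    """Outputs MSB_k(1/pi_j) with pi_j = prod_{i != j} (alpha + x_i) (assumed).

    Returns (outputs, number_of_multiplications).
    """
    m = p.bit_length()
    shifted = [(alpha + x) % p for x in xs]
    mults = 0
    pi = shifted[0]
    for s in shifted[1:]:              # n - 1 multiplications for the full product
        pi = pi * s % p
        mults += 1
    pi_inv = pow(pi, -1, p)            # single inversion
    outputs = []
    for s in shifted:                  # 1/pi_j = pi^{-1} * (alpha + x_j)
        outputs.append((pi_inv * s % p) >> (m - k))
        mults += 1
    return outputs, mults
```

The final loop has no data dependencies between iterations, which is one way to see why this variant parallelizes more readily than the prefix-product approach.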

Using a harder MIHNP problem. Another possibility is to use harder variants
of MIHNP. For example, instead of using $f_\alpha(x) = 1/(\alpha + x)$ as our underlying
function, we can use $f_{\alpha,\beta}(x) = \beta/(\alpha + x)$ where both $\alpha$ and $\beta$ are secret.

(We mention in passing that just like the original MIHNP, this variant too
has limited random self-reducibility (in $\alpha$ and $\beta$). This is because we can set
$y_i = (x_i + s) \cdot r^{-1}$, and then we have $f_{\alpha,\beta}(y_i) = f_{\alpha r + s,\, \beta r}(x_i)$. For any fixed $\alpha, \beta$ (with
$\beta \ne 0$), if we choose $r, s$ uniformly at random (with $r \ne 0$) then $\alpha r + s$, $\beta r$ are
uniformly random and independent.)
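The re-randomization can be checked numerically. The sketch below assumes the substitution $y = (x + s) \cdot r^{-1} \bmod p$, which is one map consistent with the new secrets $\alpha r + s$ and $\beta r$; the function names and sample values are illustrative.

```python
def f(alpha, beta, x, p):
    """beta / (alpha + x) modulo p."""
    return beta * pow((alpha + x) % p, -1, p) % p

def rerandomized_point(x, r, s, p):
    """y = (x + s) / r modulo p (assumed form of the substitution)."""
    return (x + s) * pow(r, -1, p) % p

def rerandomized_secrets(alpha, beta, r, s, p):
    """The pair (alpha*r + s, beta*r) modulo p."""
    return (alpha * r + s) % p, beta * r % p
```

Under this substitution, evaluating the original function at y gives the same value as evaluating the re-randomized function at x.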

From the analysis in Section 3.4, it follows that this problem is infeasible to
solve when the number of “missing bits” is more than $m/2$ (as opposed to $2m/3$
for the original MIHNP). This means that we may be able to output as many
as $m/2 - r$ bits for a security level of $2^r$. Assuming that we still set $r = k$, this
argument suggests that we must set $m$ (the size of our prime) so that $k < m/2 - k$,
or $m > 4k$.

Each invocation of the generator now stretches $m(n + 2)$ random bits into
$n(m + k)$ pseudorandom bits, so we get $nk - 2m$ pseudorandom bits per
invocation. For a numerical example, to get a security level of $2^{80}$, we choose $m = 320$,
and $k = 80$ (i.e., work with a 320-bit prime and output 80 bits of $\beta/(\alpha + x_i)$).
Working with $n = 10$, we have 3840 bits of state and 160 bits per
invocation (same as for the $n = 8$ example from above). However, this generator
does roughly 25% more operations, but in a smaller field ($|p| = 320$ instead of
$|p| = 480$), so we expect it to be nearly twice as fast. Similarly, using $n = 22$ we
get 1120 bits per application with state of 7040 bits, which is the same number
of bits per invocation, and somewhat smaller state than the $n = 20$ example
above. Again, we do 10% more operations over a smaller field, so we expect the
overall running time to be roughly twice as fast.



4.2 Message Authentication Code 

The computational-MIHNP directly implies an efficient “weak MAC”, secure
under known (random) message attacks. The parameters $p$ and $k$ are chosen
just as for the generator, and the secret MAC key is an element $\alpha \in \mathbb{Z}_p$. To
authenticate a message $x$, one adds the authentication tag

$$\mathrm{MAC}_\alpha(x) = \mathrm{MSB}_k\!\left(\frac{1}{\alpha + x}\right)$$



Proposition 2. Suppose the $\delta$-Computational MIHNP assumption holds. Then
when $k < \delta(\log_2 p)$ the above MAC is secure under known (random) message
attacks.

The proof is immediate. The cost per MAC computation is thus just one
modular inversion. Moreover, if we need to compute MACs for many messages
$x_1, \ldots, x_n$, we can use the same batching trick from the previous section to speed
up this computation.
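A minimal sketch of the weak MAC follows, assuming messages are already encoded as elements of $\mathbb{Z}_p$ and that $\mathrm{MSB}_k$ means the top k bits of the m-bit value. The function names are hypothetical, and the comparison in `mac_verify` is a plain equality test chosen for brevity rather than a hardened implementation.

```python
def mac_tag(alpha, x, p, k):
    """Tag MSB_k(1/(alpha + x) mod p) for message x under key alpha."""
    s = (alpha + x) % p
    if s == 0:
        raise ValueError("alpha + x is not invertible modulo p")
    return pow(s, -1, p) >> (p.bit_length() - k)

def mac_verify(alpha, x, tag, p, k):
    """Recompute the tag and compare (illustrative, not constant-time)."""
    return mac_tag(alpha, x, p, k) == tag
```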

The “weak MAC” above can be converted to a MAC secure against chosen 
message attack, using standard techniques. For example, one could apply the 
MAC to a random string r and then use a one-time signature based on r to 
sign the message x. However, these generic conversion techniques (from security 
against a known message attack to security against a chosen message attack) 
make the MAC much less efficient. We do not know whether the MAC from 
above is by itself secure against chosen message attack. This would require a 
version of the computational MIHNP assumption, where the Xi’s can be chosen 
by the attacker. Currently we cannot tell whether this chosen message MIHNP 
problem is intractable. 
5 Conclusions and Open Problems

In this paper we proposed the MIHNP, and variants thereof, as new and
potentially hard mathematical problems. We presented a few efficient cryptographic
constructions based on these problems. To justify the hardness of these MIHNP
problems we used the most up-to-date lattice analysis techniques to solve
MIHNP and even allowed the attacker the power to reduce infeasibly large lattices.
Our best algorithm works whenever the fraction of given bits is greater than one
third of the length of the modulus. However, the lattice based approach does
not extend to solve the MIHNP when less than a third of the bits is given. We
therefore conjectured the MIHNP is hard in this case. MIHNP is an interesting
and efficient building block for cryptographic systems. It clearly deserves further
study.

One particularly interesting question to answer is how much easier the
MIHNP problem becomes if the $x_i$ are not randomly chosen, but adversarially
chosen. If the Computational-MIHNP remains hard when the $x_i$'s are chosen
adversarially then we obtain an efficient MAC from MIHNP. Also, it is very
interesting to see whether any non-lattice approaches shed any light on the
hardness of these MIHNP problems.

Lastly we mention that the analysis we have used for MIHNP can be
heuristically applied to certain modular polynomials arising from using the
Diffie-Hellman protocol with elliptic curves (ECDH). As was recently done in [5], we
may apply our results to proving statements on the bit security of ECDH.
Specifically, to prove the bit security of ECDH on a specific curve $E$, it is sufficient to
solve the following hidden-number problem (called ECHNP): We are given

$$(x_i, y_i), \quad \mathrm{MSB}_k\!\left( \left( \frac{y - y_i}{x - x_i} \right)^2 - x_i - x \right)$$

for many random points $(x_i, y_i) \in E$, and we need to find the hidden point
$(x, y) \in E$. The (heuristic) analysis from Section 3.4 can be applied to this
problem too, and it suggests that ECHNP can be solved for $\delta > 3/5$. This would
mean that given an algorithm that computes the top 3/5 fraction of bits in (the
x-coordinate of) the ECDH secret, one can devise an algorithm to compute all
the bits. However, since the analysis in Section 3.4 is only a heuristic, one does
not immediately get a proof of bit-security for ECDH.
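To illustrate what an ECHNP sample looks like, the sketch below computes the x-coordinate of the sum of the hidden point and a known point on a short Weierstrass curve $y^2 = x^3 + ax + b$ over a small prime field, and then keeps its top bits. The toy curve parameters, the brute-force point enumeration, the function names, and the reading of $\mathrm{MSB}_k$ as a right shift are all assumptions for illustration.

```python
def curve_points(a, b, p):
    """All affine points on y^2 = x^3 + a*x + b over F_p (toy sizes only)."""
    return [(x, y) for x in range(p) for y in range(p)
            if (y * y - (x * x * x + a * x + b)) % p == 0]

def slope(P, Q, p):
    """Slope of the chord through P and Q, which must have distinct x."""
    (x, y), (xi, yi) = P, Q
    if (x - xi) % p == 0:
        raise ValueError("this sketch only handles points with distinct x")
    return (y - yi) * pow((x - xi) % p, -1, p) % p

def sum_x_coordinate(P, Q, p):
    """x-coordinate of P + Q: ((y - yi)/(x - xi))^2 - xi - x modulo p."""
    lam = slope(P, Q, p)
    return (lam * lam - Q[0] - P[0]) % p

def echnp_sample(hidden, Q, p, k):
    """The known point Q together with the top k bits of x(hidden + Q)."""
    return Q, sum_x_coordinate(hidden, Q, p) >> (p.bit_length() - k)
```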

We leave further details to a subsequent paper, but mention that while we
were able to convert the heuristic analysis into a formal proof in some cases, the
result that we get is very weak: We can only prove that for some small constant
$\epsilon \approx 0.02$, computing a $(1 - \epsilon)$ fraction of the bits in the ECDH secret is as hard as
computing them all. This is related to a recent result of Boneh and Shparlinski
[5] which shows that if ECDH is hard on some curve $E$ then there is no single
efficient algorithm that predicts one bit of the ECDH secret for many curves
isomorphic to $E$. Our result applies to blocks of bits (rather than a single bit),
but is stronger than [5] in the sense that it applies to a specific curve rather
than a family of curves. We show that if ECDH is hard on a specific curve then
the top $(1 - \epsilon)$ fraction of the bits of the ECDH secret on that curve cannot be
efficiently computed.
Abstract. One interesting and important challenge for the cryptologic
community is that of providing secure authentication and identification
for unassisted humans. There is a range of protocols for secure
identification which require various forms of trusted hardware or software,
aimed at protecting privacy and financial assets. But how do we verify
our identity, securely, when we don’t have or don’t trust our smart card,
palmtop, or laptop?

In this paper, we provide definitions of what we believe to be
reasonable goals for secure human identification. We demonstrate that existing
solutions do not meet these reasonable definitions. Finally, we provide
solutions which demonstrate the feasibility of the security conditions
attached to our definitions, but which are impractical for use by humans.



1 Introduction 

Consider the problem of human identification. A human $H$ wishes to prove his
identity to a computational device $C$. The channel over which $H$ and $C$ will
communicate is insecure and possibly controlled by an adversary. The protocol
which accomplishes this task must satisfy the property that no adversary, even
one who has witnessed past identifications, may successfully impersonate $H$
except with negligible probability. Complicating matters further, $H$ and $C$ would
like to reuse the secret they share for many identifications.

This problem arises on a daily basis in our society, yet the solutions to date
are inadequate for several reasons. The traditional password approach is
unacceptable, since a network snoop can record the password and will then be able
to falsely authenticate as the user at will. Schemes which build a
cryptographically strong key from some initial weak secret, such as SRP and EKE, require
trusted hardware and software, since the computations involved are far beyond
the abilities of most humans. Zero-knowledge schemes such as Fiat-Shamir [1]
require trusted hardware which can be stolen or compromised. One-time
passwords [2] are just that - good for only a single authentication; pads of such
passwords are vulnerable to theft and still require a large ratio of “key material”
to authentications.

These schemes all require the human to have some computational or memory
aid to securely authenticate himself. In this paper we seek a solution that is viable
for the traveler who lost his luggage, or the purchaser who forgot his wallet. We
believe that practical scenarios such as these justify the need for such a solution.

An alternative to the above schemes (SRP, EKE, Fiat-Shamir, one-time
passwords) is a challenge-response protocol:

— The user and computer share a secret. 

— The computer randomly challenges the user 

— The user responds in such a way that an adversary cannot easily learn the 
secret. 

Papers by Matsumoto and Imai [3], Wang et al [4], and Matsumoto [5] provide 
schemes which are sufficient for a small number of authentications. In their case, 
the secret can be recovered in polynomial time once a linear (in the size of 
the secret) number of authentications have been witnessed by an eavesdropper. 
(In our case, the number of authentications that must be witnessed to recover 
a secret in polynomial time is quadratic in the size of the challenge, which in 
turn is superpolynomial in the size of the secret.) Naor and Pinkas [6] give an 
identification protocol which is secure for a number of identifications which is 
linear in the size of the challenge and which requires a low-tech hardware item: a 
transparency. If stolen, the transparency can be copied and used to masquerade 
successfully as the legitimate user. 

It is the goal of this paper to suggest that protocols which allow unaided hu- 
mans to identify themselves securely and repeatedly may be feasible and should 
be a goal of the cryptographic community. In Section 2 we provide security defi- 
nitions which we contend should be the goal of human identification protocols. In 
Section 3, we give examples of some cryptographic primitives which humans can 
execute without assistance. Section 4 gives a protocol which is provably secure 
against eavesdropping adversaries, based on these primitives; Section 5 outlines 
a protocol which is heuristically secure against arbitrary adversaries. This pro- 
tocol is composed of a small number of steps that are individually feasible for 
humans. As a whole, however, the protocol requires too much computation (and 
possibly too much memory) to be practical for most humans. 



2 Definitions 

We begin by formally defining the notion of an identification protocol, and what 
we will mean for a protocol to be human executable. We then define two notions 
of security, in terms of passive and active adversaries. Finally we show how some 
traditional solutions to this problem either fail to satisfy the conditions of human 
execution or security. 



2.1 Human Identification Protocols 

We follow [7] in defining a protocol as a pair of (public, probabilistic) interacting
programs (H, C) with auxiliary inputs; we denote the result of interaction
between H and C with inputs x and y as (H(x), C(y)), and we denote the transcript
of bits exchanged during their interaction by T(H(x), C(y)). A protocol
yields some form of identification if H and C accept with high probability when 
run with the same auxiliary input and reject with high probability when run 
with different auxiliary input. 

Definition 1. An identification protocol is a pair of probabilistic interactive
programs (H, C) with shared auxiliary input z, such that the following conditions
hold:

— For all auxiliary inputs z, Pr[(H(z), C(z)) = accept] > 0.9.

— For each pair x ≠ y, Pr[(H(x), C(y)) = accept] < 0.1.

When (H, C) = accept, we say that H verifies his identity to C, C authenticates
H, or H authenticates to C.

In this paper we are interested in the case where H can be executed by a
human. For the reasons outlined in Section 1, we rule out any form of com- 
putational aid. Additionally, we allow for occasional human error and varying 
abilities of the human population: 

Definition 2. An identification protocol (H, C) is said to be $(\alpha, \beta, t)$-human
executable if at least a $(1 - \alpha)$ portion of the human population can perform the
computations H unaided and without errors in at most t seconds, with probability
greater than $1 - \beta$.

An ultimate goal might be to design a (.1, .1, 10)-human executable identifi- 
cation protocol that also meets the security definitions defined subsequently; the 
protocols we give here are on the order of (.9, .2, 300)-human executable, which 
is clearly not practical as a replacement for traditional solutions to the problem. 
Still, since they meet our security conditions, we believe they provide evidence 
that such a protocol is feasible. 

A practical issue concerns whether the claim “(H, C) is $(\alpha, \beta, t)$-human
executable” can be demonstrated. Since we lack a well-defined model of human
computation, establishing the claim rigorously seems infeasible in most cases. 
However, we believe that for the present, in many cases such claims can be eval- 
uated intuitively. In cases where they cannot, empirical evidence should suffice. 



2.2 Security Definitions 

We give both a weak characterization of security, in terms of passive adversaries, 
and a strong characterization of security, in terms of active adversaries. Both 
characterizations are parameterized by a pair (p, k) where p gives the probability 
that a computationally bounded attacker can successfully simulate H to C after
k interactions with H and/or C. 

Definition 3. An identification protocol (H, C) is (p, k)-secure against passive
adversaries if for all computationally bounded adversaries A,

$$\Pr\bigl[(A(T^k(H(z), C(z))), C(z)) = \text{accept}\bigr] \le p ,$$

where $T^k(H(z), C(z))$ is a random variable sampled from k independent
transcripts $T(H(z), C(z))$.

That is, even after a passive adversary has witnessed k identification sessions 
between H and C, he still cannot successfully masquerade as H with probability 
greater than p. A passive adversary models the eavesdropper or “shoulder-surfer” 
who is willing to watch H identify himself but does not control the communi- 
cation channel between H and C. On the other hand, an active adversary is 
permitted to control the channel between H and C, which leads to a much 
stronger definition of security. 

Definition 4. An identification protocol (H, C) is (p, k)-secure against active
adversaries if for all computationally bounded adversaries A,

$$\Pr\bigl[(A(T^k(A, H(z), C(z))), C(z)) = \text{accept}\bigr] \le p ,$$

where $T^k(A, H(z), C(z))$ denotes a random variable sampled from k sessions
where A is allowed to observe and make arbitrary changes to the communications
between H and C.

This last definition is a theoretical goal which in practice is not achieved by 
any existing solution to this problem, except for the case k = 1. For example,
most password-based protocols may be compromised in one authentication by a 
trojan horse which records the user’s password before performing (or failing to 
perform) the computational steps involved. Therefore, we will relax this condi- 
tion as follows. We will allow a third outcome for the interaction of H and C 
(and any third parties): we will allow H to reject C. This will be denoted by 
(H(·), C(·)) = ⊥. Our relaxed security requirement is that after eavesdropping
on k identification sessions, A still has probability at most q of interacting with
H and C without being detected:

Definition 5. An identification protocol (H, C) is (p, q, k)-detecting against
active adversaries if for all computationally bounded adversaries A,

— $\Pr\bigl[(H(z), A(T^k(H(z), C(z)))) \ne \bot\bigr] \le q$

— $\Pr\bigl[(A(T^k(H(z), C(z))), C(z)) = \text{accept}\bigr] \le p$.

In this setting, we deprive the adversary A of the opportunity to interfere 
with communication between H and C. For a protocol satisfying this security 
condition, H should consider his communications with C to be compromised
once H rejects C, and should not respond to any further authentication requests 
until the parties may securely exchange a new secret z' . 

We note that in the human-executable setting some parameters may be relaxed
when compared with computationally intensive protocols for identification.
For example, a “standard” cryptographic goal for an identification protocol
might be a protocol which is (2“™, 2“™, 2^°°)-detecting against active adver- 
saries. But when a human is providing the transcripts for {C {z) , H [z)) it is 

quite reasonable to expect that security for $10^6$ authentications will be sufficient,
since a human would take decades to provide so many. Further, many applica- 
tions which require human authentication are apparently more tolerant to false 
positives; for example, most automated teller machines have a confidence level 
of only 10“^. Thus a (10“®, 10“®, 10’^)-detecting protocol may be acceptable for 
humans. 



3 Plausible Hard Problems 

In this section we introduce two computational problems as candidates for con- 
structing secure human executable authentication protocols, along with some 
evidence that these computational problems are hard. Both problems can be 
characterized as loosely based on the sparse subset sum problem, taken over 
vectors of digits, with some twists intended to allow more authentications. 



3.1 Learning Parity in the Presence of Noise 

Suppose the secret shared between the human and the computer is a vector x
of length n over GF(2). Authentication proceeds as follows: The computer, C,
generates a random n-vector c over GF(2) and sends it to the human, H, as a
challenge. H responds with the bit r = c · x, the inner product over GF(2). C
accepts if r = c · x. Clearly on a single authentication, C accepts a legitimate
user H with probability 1, and an impostor with probability 1/2; iterating k times
results in accepting an impostor with probability $2^{-k}$. Unfortunately, after
observing O(n) challenge-response pairs between C and H, the adversary M can
use Gaussian elimination to discover the secret x and masquerade as H.

Suppose we introduce a parameter $\eta \in (0, \frac{1}{2})$ and allow H to respond
incorrectly with probability $\eta$; in that case the adversary can no longer simply use
Gaussian elimination to learn the secret x. This is an instance of the problem 
of learning parity with noise (LPN). In fact the problem of learning x becomes 
NP-Hard in the presence of errors; it is NP-Hard to even find an x satisfying 
more than half of the challenge-response pairs collected by M [8]. Of course, 
the hardness results of Hastad [8] simply imply that there exist instances of this 
problem which cannot be solved in polynomial time unless P=NP; it is still pos- 
sible that the problem is tractable in the random case. However, Kearns [9] has 
shown that in the random case, parity is not efficiently learnable in the statisti- 
cal query model; and all known efficient learning algorithms for noisy concepts 
can be cast in this model. Additionally, Blum et al [10] show that for the case of 
uniformly distributed challenges, weak prediction is equivalent to strong prediction:
that is, any algorithm to predict the next response bit with probability
$\frac{1}{2} + \frac{1}{\mathrm{poly}(n)}$ can be used to recover the underlying parity function; and any
algorithm which can learn LPN when the parity function is chosen uniformly can
be used to learn arbitrary parity functions. Further, the best known algorithm
for the general random problem, due to Blum, Kalai and Wasserman, requires
$2^{O(n/\log n)}$ challenge-response pairs and works in time $2^{O(n/\log n)}$; here we will
give some evidence that this problem is, in fact, uniformly hard and cannot be
solved in time and sample size $\mathrm{poly}(n, 1/(\frac{1}{2} - \eta))$.

In the following, we will refer to an instance of LPN as an m × n matrix A
(where m = poly(n)), an m-vector b, and a noise parameter $\eta$; the problem is
to find an n-vector x such that $|Ax - b| \le \eta m$, where |x| denotes the Hamming
weight of the vector x.

Lemma 1. (Pseudo-randomizability)

Any instance of LPN can be transformed in polynomial time into an instance
chosen uniformly at random from a space of $2^{n^2}$ possibilities.

Proof: Choose the n × n matrix $R \in_U \{0, 1\}^{n \times n}$. Then if there is a solution to
the instance $(AR, b, \eta)$, say y, then we have:

$$|(AR)y - b| \le \eta m ,$$

and if we let x := Ry we find that Ax = A(Ry) = (AR)y, which yields the
desired x, since:

$$|Ax - b| = |(AR)y - b| \le \eta m .$$

Thus there is a polynomial-time transformation between adversarial instances 
and a large class of random instances, such that a solution to the randomly
chosen instance can be transformed into a solution to the adversarial instance.
Phrased differently, each instance of LPN belongs to a space of $O(2^{n^2})$ instances
such that either all of the instances are easy or only a negligible fraction are 
easy. This is similar to the situation with discrete logarithms, where either all 
of the instances modulo a given prime are easy, or only a negligible fraction are 
easy. 

Lemma 2. (Log-Uniformity)

If there exists an algorithm A capable of solving a 1/poly(n) fraction of the
instances $(A, b, \eta)$ of LPN in time $\mathrm{poly}(n, \log(1/(\frac{1}{2} - \eta)))$, then with high
probability, any instance can be solved in time $\mathrm{poly}(n, \log(1/(\frac{1}{2} - \eta)))$.

Proof: Let $\epsilon(\eta) = \frac{1}{2} - \eta$, and let A be an algorithm which solves random
instances in time $\mathrm{poly}(n, \log(1/\epsilon(\eta)))$. Let $(A, b, \eta)$ be an adversarial instance of
LPN. Create the new instance $(A', b', \eta')$ as follows:

— For each row of A, randomly choose n other rows of A and use the sum of
these rows as the corresponding entry in A'.

— Fill in the corresponding entry in b' by adding the corresponding rows of b.

— Set $\eta' := \frac{1}{2} - \frac{1}{2}(1 - 2\eta)^{n+1}$.

Given the error rate $\eta$ in the initial instance, the error rate $\eta'$ is correct, by the
following lemma (due to Blum, Kalai, and Wasserman):

Lemma 3. Let $(a_1, b_1), \ldots, (a_s, b_s)$ be samples from $(A, b, \eta)$; then $b_1 + \cdots + b_s$
is the correct label for $a_1 + \cdots + a_s$ with probability $\frac{1}{2} + \frac{1}{2}(1 - 2\eta)^s$.

The proof follows by induction on s [11]. The resulting instance is distributed
uniformly; so with probability 1/poly(n), A solves it in time $\mathrm{poly}(n, \log(1/\epsilon(\eta')))$.
But note that:

\begin{align*}
\epsilon(\eta') &= \tfrac{1}{2}(1 - 2\eta)^{n+1} \\
&= \tfrac{1}{2}\bigl(1 - 2(\tfrac{1}{2} - \epsilon(\eta))\bigr)^{n+1} \\
&= \tfrac{1}{2}(2\epsilon(\eta))^{n+1} ,
\end{align*}

so that $\mathrm{poly}(n, \log(1/\epsilon(\eta'))) = \mathrm{poly}(n, \log(1/\epsilon(\eta)))$; since the expected number
of attempts to find an instance soluble by A is poly(n), A solves adversarial
instances in time $\mathrm{poly}(n, \log(1/\epsilon(\eta)))$.
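
The noise-amplification identity of Lemma 3, and the resulting error rate of the re-randomized instance, can be checked numerically. The following is a minimal sketch; the function names are hypothetical, and the step-by-step recursion is simply the induction on s written out.

```python
def prob_sum_correct(eta, s):
    """Probability that the sum of s labels, each wrong independently
    with probability eta, is the correct label for the summed challenge."""
    p_correct = 1.0  # an empty sum is trivially correct
    for _ in range(s):
        # The running sum stays correct if the new label is right, or
        # becomes correct if it was wrong and the new label is also wrong.
        p_correct = p_correct * (1 - eta) + (1 - p_correct) * eta
    return p_correct

def closed_form(eta, s):
    """Closed form from Lemma 3: 1/2 + 1/2 (1 - 2 eta)^s."""
    return 0.5 + 0.5 * (1 - 2 * eta) ** s

def combined_noise(eta, n):
    """Error rate of a row built by summing n + 1 rows of the original instance."""
    return 0.5 - 0.5 * (1 - 2 * eta) ** (n + 1)
```

For example, `0.5 - combined_noise(eta, n)` equals `0.5 * (2 * (0.5 - eta)) ** (n + 1)`, which is the last line of the derivation above.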

Conjecture 1. (Hardness of LPN)

LPN is uniformly hard in n and $\eta$: there is no algorithm to solve a uniformly
chosen instance $(A, b, \eta)$ in time $\mathrm{poly}(n, 1/(\frac{1}{2} - \eta))$ with non-negligible
probability.

Evidence: 

— (LPN) is not efficiently learnable in the statistical query model; combined 
with the uniformity results of Blum et al this suggests that uniformly chosen 
inputs are hard. 

— The best known algorithm for the random case, given by Blum, Kalai, and 
Wasserman, has superpolynomial complexity. 

— Lemmas 1 and 2. 

This assumption is not unprecedented: the McEliece public-key cryptosystem 
[12] relies on a related assumption, and the pseudo-random generator proposed
by Blum, Furst, Kearns and Lipton [10] is secure under a very similar assumption. 

In adapting this problem to use by humans, we restrict the Hamming weight of
the secret vector x to be k, where k is roughly logarithmic in n, the length of the
challenge. Rather than taking the inner product of vectors over GF(2), challenges
are vectors of decimal digits, and responses are the sum without carries (i.e.,
modulo 10) of the digits in the positions corresponding to the non-zero entries of
x. Our best algorithm for solving instances of this related problem has complexity
$\binom{n}{k/2}$. This algorithm proceeds by evaluating all possible Hamming-weight k/2
vectors on the challenges, and applying hashing to find pairs of vectors which
sum to the correct response on roughly a fraction $1 - \eta$ of challenges. Note that
while this attack is better than the brute force approach of guessing all weight-k
vectors (which has complexity $\binom{n}{k}$), the complexity is still superpolynomial
when k is logarithmic in n.

3.2 Sum of k Mins

Let $z = ((x_1, y_1), (x_2, y_2), \ldots, (x_k, y_k))$ be a set of pairs $(x_i, y_i)$ of integers mod
n. Let $v \in \{0, \ldots, 9\}^n$, and define f(v, z) by:

$$f(v, z) = \sum_{i=1}^{k} \min\{v[x_i], v[y_i]\} \bmod 10 .$$

Then the sum of k mins problem is: given m pairs $(v_1, u_1), \ldots, (v_m, u_m)$,
where $v_i \in \{0, \ldots, 9\}^n$, $u_i \in \{0, \ldots, 9\}$, and $k \log_{10} n < m < \binom{n}{2}$, find a set z
such that $u_i = f(v_i, z)$ for all i = 1, ..., m.
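
As a concrete illustration of the function f, the following sketch evaluates a sum of k mins on a digit vector. It assumes 0-based indices and plain Python sequences; the name `sum_of_k_mins` is hypothetical.

```python
def sum_of_k_mins(v, z):
    """f(v, z): sum over pairs (x, y) in z of min(v[x], v[y]), reduced mod 10.

    v is a sequence of decimal digits (the challenge); z is a sequence of
    (x, y) index pairs (the secret). Indices are 0-based in this sketch.
    """
    return sum(min(v[x], v[y]) for x, y in z) % 10

# Example: mins are 3, 5 and 2, which sum to 10, i.e. 0 mod 10.
example_response = sum_of_k_mins([3, 1, 4, 1, 5, 9, 2, 6], [(0, 2), (4, 5), (6, 7)])
```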

An algebraic approach to this problem is to form the system of equations 
given by: 



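$$
\begin{bmatrix}
v_{1,1,2} & v_{1,1,3} & \cdots & v_{1,n-1,n} \\
v_{2,1,2} & v_{2,1,3} & \cdots & v_{2,n-1,n} \\
\vdots & \vdots & \ddots & \vdots \\
v_{m,1,2} & v_{m,1,3} & \cdots & v_{m,n-1,n}
\end{bmatrix}
\begin{bmatrix}
z_{1,2} \\ z_{1,3} \\ \vdots \\ z_{n-1,n}
\end{bmatrix}
=
\begin{bmatrix}
u_1 \\ u_2 \\ \vdots \\ u_m
\end{bmatrix}
\pmod{10} ,
$$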



where $v_{k,i,j} = \min\{v_k[i], v_k[j]\}$, $z_{i,j} = 1$ if $(i, j) \in z$, and $1 \le i < j \le n$.
If $m > \binom{n}{2}$ we expect to solve this system uniquely by Gaussian elimination.
When $m < \binom{n}{2}$, on the other hand, this approach leads to a sparse subset sum
problem. The best known algorithms for these instances have complexity roughly 
("^"fc/ 2 ^^^) (which is greater than when fc > 3). 

Another approach to the problem is a form of maximum-likelihood estimation
(MLE): for some subset of the locations in z, try all possible values, while
modeling the remaining inputs to f(·, z) as uniform random variables (an accurate
model when the $v_i$ are chosen at random). Choose the subset of locations
which gives the best chance of observing the output values $u_i$. If the subset of z
we are guessing has $\ell$ locations, this algorithm has complexity $\binom{n}{\ell}$. However, to
succeed in selecting correct locations, the algorithm may require many samples
(perhaps more than $\binom{n}{2}$).

For any distribution $\mathcal{D}$, the maximum probability of distinguishing between
$\mathcal{D}$ and the uniform distribution on the same range U is $\Delta(\mathcal{D}, U)$, where

$$\Delta(A, B) = \frac{1}{2} \sum_{e \in E} \bigl| \Pr_A[e] - \Pr_B[e] \bigr| ,$$

i.e., the statistical distance between A and B. Thus the expected minimum number
of samples required to distinguish between $\mathcal{D}$ and U is $1/\Delta(\mathcal{D}, U)$. Therefore
calculating this distance for the distribution of the modulo 10 sum of k mins will
help us develop lower bounds on the required sample complexity for MLE.

To calculate the statistical distance between k mins and uniformly random
digits, we derive an expression which will allow us to calculate the probability
of obtaining a digit as the sum of k mins. Let $p_k(d)$ denote the probability of
obtaining the digit d as a sum of k mins. Then the $p_1(d)$ are easily obtainable by
enumerating all pairs of digits. For k > 1, we note that for each d, there are 10
ways to obtain d as the sum of k mins: for each digit d', we obtain d' from one
min and d − d' mod 10 from the other k − 1 mins. In other words, we can write
the recurrence $p_k(d) = \sum_{0 \le d' \le 9} p_1(d')\, p_{k-1}((d - d') \bmod 10)$, which leads to the
observation that dynamic programming is sufficient for obtaining the distribution
over k mins. Table 1 yields the result of applying this procedure to calculate the
expected minimum sample complexity to distinguish between the uniform
distribution on {0, ..., 9} and a sum of k mins, for k ≤ 12.



Table 1. Distribution of sum of k mins, and expected minimum number of samples 
required to distinguish from uniform 



 k    #S
 1    4
 2    14
 3    44
 4    140
 5    532
 6    1346
 7    4154
 8    12848
 9    39696
10    122682
11    379100
12    1171498



Note that the essential meaning of this table is that without guessing more 
than 12 locations from a challenge, an adversary cannot expect to use statisti- 
cal procedures to learn a sum of 12 mins password with fewer than 1,171,498 
challenge-response pairs. In general, we can protect against this attack by choosing
k such that the number of required samples is greater than $\binom{n}{2}$, since $\binom{n}{2}$
samples are sufficient for Gaussian elimination.
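
The dynamic-programming procedure for the distribution of a sum of k mins can be sketched as follows. This is an illustration of the recurrence only: it assumes the two digits of each min are independent and uniform, takes the statistical distance to be half the L1 distance, and uses hypothetical function names. It is not claimed to reproduce every entry of Table 1, whose exact conventions are not spelled out here.

```python
from fractions import Fraction

def single_min_distribution():
    """Distribution of min(a, b) for independent uniform digits a and b."""
    dist = [Fraction(0)] * 10
    for a in range(10):
        for b in range(10):
            dist[min(a, b)] += Fraction(1, 100)
    return dist

def sum_of_mins_distribution(k):
    """Distribution of the sum of k independent mins, reduced mod 10."""
    base = single_min_distribution()
    dist = base
    for _ in range(k - 1):
        # One step of the recurrence: add one more min to the running sum.
        dist = [sum(base[dp] * dist[(d - dp) % 10] for dp in range(10))
                for d in range(10)]
    return dist

def statistical_distance_from_uniform(dist):
    """Half the L1 distance between dist and the uniform distribution on digits."""
    return sum(abs(p - Fraction(1, 10)) for p in dist) / 2

def samples_to_distinguish(k):
    """Reciprocal of the statistical distance, as an exact fraction."""
    return 1 / statistical_distance_from_uniform(sum_of_mins_distribution(k))
```

Under these assumptions `samples_to_distinguish(1)` evaluates to exactly 4, and the value grows as k increases because each additional min pushes the sum closer to uniform.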

4 Security against Passive Adversaries 

In this section, we will give a protocol which is (p, k)-secure against passive
adversaries but not against arbitrary adversaries. We also give some empirical
evidence that it is (0.9, 0.25, 160)-human executable. Intuitively, C generates the
coefficient matrix of some LPN instance while H generates the output vector and
some errors. Thus after a number of repetitions C can be reasonably sure that 
H knows the shared secret vector x. 

Protocol 1

Shared Secrets: H and C share a secret 0-1 vector x with |x| = k.

Authentication:

(C1) C sets i := 0.
Repeat m times:

(C2) C selects a random challenge $c \in_U \{0, 1\}^n$ and sends it to H.
(H1) With probability $1 - \eta$, H responds with r := c · x; otherwise H responds
with r := 1 − c · x.

(C3) If r = c · x, C increments i.

(C4) If $i \ge (1 - \eta)m$, C accepts H.
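
The steps of Protocol 1 can be simulated with a short sketch. It assumes bit vectors are Python lists and models H as a callable; the helper names (`run_protocol_1`, `honest_human`, `random_guesser`) are hypothetical and not part of the protocol description.

```python
import random

def inner_product_gf2(c, x):
    """Inner product of two bit vectors over GF(2)."""
    return sum(ci & xi for ci, xi in zip(c, x)) % 2

def run_protocol_1(secret, m, eta, respond, rng):
    """Run m rounds with `respond(c)` playing H; return True if C accepts."""
    n = len(secret)
    correct = 0                                   # (C1) i := 0
    for _ in range(m):
        c = [rng.randrange(2) for _ in range(n)]  # (C2) random challenge
        r = respond(c)                            # (H1) response from H
        if r == inner_product_gf2(c, secret):     # (C3) count correct responses
            correct += 1
    return correct >= (1 - eta) * m               # (C4) acceptance threshold

def honest_human(secret, eta, rng):
    """H knows the secret and deliberately errs with probability eta."""
    def respond(c):
        r = inner_product_gf2(c, secret)
        return r ^ 1 if rng.random() < eta else r
    return respond

def random_guesser(rng):
    """An impostor who answers each challenge with a random bit."""
    return lambda c: rng.randrange(2)
```

Running many sessions with `random_guesser` gives an empirical feel for the bound of Theorem 1 below.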

Theorem 1. If H guesses random responses r, C will accept H with probability
at most 




where cq > ^ is a constant depending only on rj. 

Proof: Let X be the random variable denoting the number of times H guesses 
correctly; since this probability is at most $\frac{1}{2}$, the probability of guessing correctly
exactly i times out of m is $\binom{m}{i} (\frac{1}{2})^m$; the first result follows from summing the
probabilities of guessing correctly $(1 - \eta)m$ or more times; the second result
follows by a Chernoff bound with cq = (3 — 2r]Y ftS > (3 — 1)^/6 = 

Theorem 2. If LPN is hard, then Protocol 1 is , poly{n)) - secure against 

a passive adversary. 

Proof: Obvious. Since a passive adversary can only observe challenge-response 
pairs (c, r), obtaining the secret x can only be accomplished via solving the LPN 
problem. 

Unfortunately, as previously mentioned, this protocol is not secure against an 
active adversary: suppose M can insert arbitrary challenges into the interaction; 
then M can record n/m(l — 77)^ successful authentications and replay them back 
to H, discarding (c, r) pairs which do not match; the remaining pairs will have 
no errors and can be solved by Gaussian elimination. Additionally, this protocol 
must be iterated many times in order to achieve any sort of security. 

As an additional consideration for the human user, the challenges c could 
be selected from $\{0, \ldots, 9\}^n$ and the arithmetic done modulo 10, a natural base
for many humans. This reduces the number of iterations necessary for a given 
security level by a constant factor. It also requires modifying the method of 
making an error: in cases when an error is to be made, the response should be 
chosen uniformly from {0, . . . , 9}. 

Note that assuming the best known attack complexity is optimal, we can 
choose parameters which will provide ample security in this setting. For example, 
when n = 1000 and k = 19, the best known attack’s complexity of ([fe/2]) 
roughly 2^®. This compares favorably with common minimum strength guidelines 
for choosing cryptographic parameters. 
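
To get a feel for these magnitudes, binomial coefficients can be evaluated directly. The sketch below assumes, for illustration only, that the half-weight in the meet-in-the-middle attack is rounded up to 10 when k = 19; `log2_binomial` is a hypothetical helper name.

```python
import math

def log2_binomial(n, r):
    """Base-2 logarithm of the binomial coefficient C(n, r)."""
    return math.log2(math.comb(n, r))

# Meet-in-the-middle over half-weight vectors versus brute force over
# all weight-k vectors, for n = 1000 and k = 19.
half_weight_bits = log2_binomial(1000, 10)
brute_force_bits = log2_binomial(1000, 19)
```

The gap between the two values shows how much the meet-in-the-middle attack saves over guessing all weight-k vectors.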

To assess the property of human executability, we conducted the following 
experiment. A computer implementing this authentication system with m = 7, 
rj = i, n = 200 and fc = 15 was attached to a Coke machine in our department’s 
lounge. The system was also implemented as a web page, which provided a tu- 
torial in its use. Students and faculty were permitted to access the web page 
as often as they wished, and a free Coke was given to anyone who could suc- 
cessfully authenticate himself to the computer attached to the coke machine. 
In a one week period, 54 users attempted 195 authentications and successfully 
completed 155. The average time per successful authentication was 166 seconds, 
and the average time per unsuccessful authentication was 171 seconds. Thus it 
is empirically clear that there is some value $\alpha$ for which this is an $(\alpha, .25, 160)$-human
executable identification protocol which is secure against computationally
bounded eavesdropping adversaries. 

5 Security against Arbitrary Adversaries 

The protocol of the previous section is quite insecure against an adversary who 
is capable of modifying the communications between H and C. For example, by
simply replaying the same challenge r back to H several times, A can compute 
the true value of r • x and thus after collecting n such error-free values, can learn 
the secret x by Gaussian elimination. Even if we simply replace weight-k LPN
by sum of \k/2'\ mins, the problem persists. That is, while simply replaying the 
same challenge to H will no longer allow A to learn the secret z, replaying the 
same challenge with a slight change (for example, changing a single ’9’ to a ’0’)
will still allow A to learn the secret z with O(n) well-chosen challenges.

Thus we seek to make it difficult for A to submit arbitrary challenges to H 
in place of those sent by C. To do so, we will introduce two mechanisms. First, 
Error-Correcting Challenges have the property that with very high probability
a challenge cannot be modified in a small number of locations. Second, we
require the challenges to satisfy some concept which is hard to learn without
membership queries, such as satisfying f(r, z) = 0 mod 10 for an independent
k-mins password z.



5.1 Error-Correcting Challenges 

Blum et al. [13] show how a function which is linear with probability $1 - \delta$ can be
self-corrected to a linear function which matches the given function with probability
$1 - 2\delta$. Self-correction of this form is used in many Probabilistically Checkable
Proof (PCP) arguments. We propose that a similar error-detecting/self-correcting
approach can be applied to the challenges in our system, resulting in
a system which has the property that with high probability an adversary cannot 
make local changes to a challenge. 

The protocol proposed in this document will use the self-correction algorithm
of [13] to achieve this goal. A legitimate challenge will consist of w × h 10 × 10
squares of digits, or n = 100wh digits. Each square will be generated by choosing
3 digits (a, b, c) uniformly at random; then the digit at location (x, y) will have
the value L(x, y) = ax + by + c mod 10. Linearity can be tested by choosing a
random point x (mod 10) and random offset mod 10, r, and testing whether
L(x) = L(x + r) − L(r) + L(0). If a challenge square passes this test several times
then we say that it is close to linear, and in the subsequent phase we will access
the value of a location x by accessing its “self-corrected” value at the randomly
chosen offset r, which is given by L(x + r) − L(r) + L(0). Thus if we reject a
challenge which contains a highly non-linear square and self-correct otherwise,
with high probability an adversary will be unable to effect a local change to a
challenge.
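
The construction of a challenge square, the linearity test, and self-correction can be sketched as follows. The sketch assumes that points and offsets are two-dimensional coordinates taken mod 10, which is one reading of the description above; all function names are hypothetical.

```python
import random

def make_square(a, b, c):
    """10 x 10 square of digits with L(x, y) = a*x + b*y + c mod 10."""
    return [[(a * x + b * y + c) % 10 for y in range(10)] for x in range(10)]

def self_corrected(square, x, y, rx, ry):
    """Value at (x, y) read indirectly as L(p + r) - L(r) + L(0), mod 10."""
    shifted = square[(x + rx) % 10][(y + ry) % 10]
    return (shifted - square[rx][ry] + square[0][0]) % 10

def passes_linearity_test(square, trials, rng):
    """Check L(p) == L(p + r) - L(r) + L(0) at random points p and offsets r."""
    for _ in range(trials):
        x, y = rng.randrange(10), rng.randrange(10)
        rx, ry = rng.randrange(10), rng.randrange(10)
        if square[x][y] != self_corrected(square, x, y, rx, ry):
            return False
    return True
```

If a single digit of an honest square is altered, the self-corrected value at that location still equals the original digit for most offsets, because the read goes through other, unmodified locations.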

5.2 The Protocol

Coupled with a deterministic response protocol to prevent replay attacks, we 

obtain the protocol outlined below. 

Protocol 2 

Shared Secrets: H and C share two sum of k mins secrets $p_1$ and $p_2$, and a
secret digit d. As in Section 3, we denote by $f(c, p_i)$ the result of taking the sum
of the self-corrected min of each pair in $p_i$ for the challenge c.

Authentication: Repeat m times for confidence $10^{-m}$:

(C1) Uniformly pick wh sets of parameters $(a_i, b_i, c_i)$ and form the error-correcting
challenge for these parameters, c = ECC(a, b, c). If $f(c, p_1) \ne d$, repeat until
the condition holds. Send the resulting challenge to H:

C → H : c = ECC(a, b, c).

(H1) Test each square for linearity. Reject if any square is not close to linear.
(Report a network infiltration to system administrator and choose a new
password.)

(H2) Check that $f(c, p_1) = d$. If not, reject and report a network infiltration to
system administrator.

(H3) Respond with the self-corrected sum of mins for the password $p_2$:

H → C : r = $f(c, p_2)$.

(C2) Reject if $r \ne f(c, p_2)$.

C accepts H if it has not rejected after m rounds.
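
One round of Protocol 2 can be sketched end to end. This is an illustration under several assumptions: a challenge is a list of 10 × 10 squares, a password is a list of pairs of locations of the form (square index, x, y), and offsets are two-dimensional. All names below are hypothetical.

```python
import random

def make_square(a, b, c):
    """10 x 10 square of digits with L(x, y) = a*x + b*y + c mod 10."""
    return [[(a * x + b * y + c) % 10 for y in range(10)] for x in range(10)]

def self_corrected(square, x, y, rng):
    """Read location (x, y) through a random offset r: L(p + r) - L(r) + L(0)."""
    rx, ry = rng.randrange(10), rng.randrange(10)
    return (square[(x + rx) % 10][(y + ry) % 10] - square[rx][ry] + square[0][0]) % 10

def close_to_linear(square, trials, rng):
    """Repeated linearity test on one square."""
    return all(square[x][y] == self_corrected(square, x, y, rng)
               for x, y in ((rng.randrange(10), rng.randrange(10)) for _ in range(trials)))

def f(challenge, password, rng):
    """Sum of self-corrected mins over the location pairs in `password`, mod 10."""
    total = 0
    for (s1, x1, y1), (s2, x2, y2) in password:
        total += min(self_corrected(challenge[s1], x1, y1, rng),
                     self_corrected(challenge[s2], x2, y2, rng))
    return total % 10

def random_password(k, num_squares, rng):
    """k random pairs of locations; used here only to set up an example."""
    loc = lambda: (rng.randrange(num_squares), rng.randrange(10), rng.randrange(10))
    return [(loc(), loc()) for _ in range(k)]

def computer_challenge(num_squares, p1, d, rng):
    """(C1) Draw error-correcting challenges until f(c, p1) == d."""
    while True:
        challenge = [make_square(rng.randrange(10), rng.randrange(10), rng.randrange(10))
                     for _ in range(num_squares)]
        if f(challenge, p1, rng) == d:
            return challenge

def human_round(challenge, p1, p2, d, rng, trials=5):
    """(H1)-(H3) Return the response, or None if H rejects the challenge."""
    if not all(close_to_linear(sq, trials, rng) for sq in challenge):  # (H1)
        return None
    if f(challenge, p1, rng) != d:                                     # (H2)
        return None
    return f(challenge, p2, rng)                                       # (H3)
```

On an honest challenge every self-corrected read equals the underlying digit, so H's response matches the value C computes in step (C2).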

Intuitively, we use self-correction on error-correcting challenges to make it 
infeasible for an adversary to make local changes to a challenge. Thus, to make 
a membership query, the adversary must make global changes to the challenge, 
yet since $f(\cdot, p_1)$ is distributed essentially uniformly, any global change will be
caught with probability at least 0.9. 

Thus heuristically, we have a protocol which is $(0.1, 0.1, \binom{n}{2})$-detecting against
computationally bounded adversaries. With the challenge size n = 900, k = 12
and m = 6, the best known attack on sum of k mins given fewer than $\binom{n}{2}$ samples
has complexity greater than (®™)> which is roughly 2®®. Thus the security of the 
system appears to be quite high. 

It seems reasonable that a human can learn to do linearity testing on sight, 
since error-correcting challenges form distinctive patterns of digits; thus the hu- 
man computational load in this protocol may be as low as 96 base 10 sums and 
24 mins to compute the response to a single challenge. For confidence $10^{-6}$,
this translates to a protocol which requires a minimum of 576 base 10 sums plus 
considerable search effort. Therefore, while this protocol offers a great deal of se- 
curity against arbitrary computationally bounded adversaries, it seems unlikely 
to be of practical significance on its own. 

6 Some Inherent Limitations



The approach to human-executable primitives taken here has some inherent lim- 
itations which may, unfortunately, make it difficult to improve on these protocols 
without a new approach. We now consider a large class of “similar” protocols in 
which the shared secret is a set of k relevant locations in an n-digit challenge.
We will show that, assuming the “meet in the middle” attack of time complexity
$O(\binom{n}{k/2})$ is optimal, there is no significantly harder function in this class,
computationally speaking, than parity with noise. 

We model the human as a finite automaton which sequentially processes k 
inputs by transitions between states in the range {1, ..., Q} and which gives
an output in the range {1, . . . , d}. Since humans have highly bounded memory, 
this model seems fitting for human computation in this application. We assume 
that the transition table for this automaton may change between inputs but is 
publicly known. 

Now consider an attack which uses m challenge-response pairs and processes
all sequences of locations of length k/2. For each location sequence, the automaton
is run forward k/2 steps, producing a string in the range $\{1, \ldots, d\}^m$. This
string and its location are inserted in a hash table with $Q^m$ spaces. Also, for each
sequence of k/2 locations, for each challenge, the automaton is started from each
intermediate state {1, ..., Q} and the list of intermediate states which produced
the correct response is retained. For each challenge, the expected number of
intermediate states retained will be Q/d. Thus we will expect approximately
$(Q/d)^m$ sequences of intermediate states to match the correct responses for each
sequence of k/2 locations; each of these intermediate state sequences can be inserted
into the same hash table of size $Q^m$. Any match in the hash table between
a “first-half” sequence and a “second-half” sequence suggests a length k sequence
of locations which matches on the m challenge-response pairs under consideration;
such a sequence can be tested against the $O(k \log_d n)$ challenge-response
pairs required to uniquely determine the secret k locations.

Now we assess the total computational work factor for this attack. First, for 
each sequence of k/2 locations, $(Q/d)^m$ length-m sequences must be inserted into
the hash table, for a total of $O((Q/d)^m n^{k/2})$ work. Also, each collision between
a “first-half” sequence and a “second-half” sequence will require some work to 
check against the full set of challenge-response pairs. For an appropriate family 
of universal hash functions, the expected number of collisions will be 

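$$n^{k/2} \times \frac{(Q/d)^m n^{k/2}}{Q^m} = \frac{n^k}{d^m} .$$

Choosing m to minimize the sum $(Q/d)^m n^{k/2} + n^k/d^m$ results in the choice

$$m = \frac{\log\left(n^{k/2}\, \frac{\log d}{\log(Q/d)}\right)}{\log Q}$$

and gives the total work factor

$$O\left(n^{k\left(1 - \frac{\log d}{2 \log Q}\right)}\right) .$$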

Thus if d and Q are close, or equal as in our protocols, an attacker can always
break such a protocol by guessing only about half of the shared secret. On the 
other hand, decreasing d relative to Q increases the number of challenges a user 
must respond to for a given confidence level, while increasing Q adds to the 
cognitive load on the human. Thus while some incremental improvement in the 
computational security of our protocols may be possible, overall our choice of 
primitives represents a close to optimal tradeoff between computational difficulty
and human cognitive load for this class of protocols. 
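
The two cost terms of the attack can be evaluated numerically to see how they trade off. The sketch below uses the insertion and collision counts from the analysis above; `balancing_m` returns the number of pairs at which the two terms are equal (a simplification of the exact minimizer), and both function names are hypothetical.

```python
import math

def attack_work(n, k, Q, d, m):
    """Hash-table insertions plus expected collisions for m observed pairs."""
    insertions = (Q / d) ** m * n ** (k / 2)
    collisions = n ** k / d ** m
    return insertions + collisions

def balancing_m(n, k, Q):
    """Value of m at which insertions equal collisions, i.e. Q**m == n**(k/2)."""
    return (k / 2) * math.log(n) / math.log(Q)
```

With Q = d the insertion term stays at n^(k/2) regardless of m, so at the balancing point the total is about twice n^(k/2), in line with the observation that the attacker needs to guess only about half of the secret.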



7 Conclusions 

We believe that the search for protocols providing secure, reusable authentica- 
tion to unaided humans is an interesting and important pursuit for the crypto- 
graphic community. In this paper, we have shown that no current solutions to 
this problem exist. We have provided definitions that we believe are reasonable 
goals for such protocols, and we have given protocols which achieve the security 
conditions attached to these goals. While we do not argue that the protocols 
we present are practical solutions to this problem - executing the protocols and 
remembering the secrets seem too hard - we believe that they are surprisingly 
close to practical while offering a good deal of security. Thus we believe that 
they suggest that more practical solutions may exist, which can match or even 
exceed their security conditions. We invite the reader to surpass them. 
Abstract. The Advanced Encryption Standard (AES) provides three 
levels of security: 128, 192, and 256 bits. Given a desired level of security 
for the AES, this paper discusses matching public key sizes for RSA and 
the ElGamal family of protocols. For the latter both traditional multi- 
plicative groups of finite fields and elliptic curve groups are considered. 
The practicality of the resulting systems is commented upon. Despite 
the conclusions, this paper should not be interpreted as an endorsement 
of any particular public key system in favor of any other. 



1 Introduction 

The forthcoming introduction [12] of AES-128, AES-192, and AES-256 creates an 
interesting new problem. In theory, AES-128 provides a very high level of security 
that is without doubt good enough for any type of commercial application. Levels 
of security higher than AES-128, and certainly those higher than AES-192, are 
beyond anything required by ordinary applications. Suppose, nevertheless, that 
one is not satisfied with the level of security provided by AES-128 and insists 
on using AES-192 or AES-256. This paper considers the question what key sizes 
of corresponding security one should then be using for the following public key 
cryptosystems: 

— RSA and RSA multiprime (RSA-MP; the earliest reference is [14]). 

— Diffie-Hellman and ElGamal-like systems [10,15] based on the discrete logarithm 
problem in prime order subgroups of 

• multiplicative groups of prime fields. 

• multiplicative groups of extension fields: fields of fixed small character- 
istic and compressed representation methods (LUC [17] and XTR [8]). 

• groups of elliptic curves over prime fields (ECC, [1]). 

These are the most popular systems and the only ones that are widely accepted. 
Systems that have recently been introduced and that are still under scrutiny 
are not included, with the exception of XTR - it is included because this paper 
sheds new light on its alleged performance equivalence to ECC. Also discussed 
are performance issues related to the usage of keys of the resulting sizes. 

C. Boyd (Ed.): ASIACRYPT 2001, LNCS 2248, pp. 67-86, 2001. 

(c) Springer-Verlag Berlin Heidelberg 2001 
The introduction of the AES will soon bring along the introduction of cryptographic 
hash functions of matching security levels [13], namely SHA-256, SHA-384, 
and SHA-512. Because many common subgroup based cryptographic protocols 
use subgroup orders and hashes of the same sizes, the decision what subgroup 
size to use with AES-$\ell$ becomes easy: use subgroups of prime order q with 
$\lceil \log_2 q \rceil = 2\ell$. For ECC that settles the issue, from a practical point of view at 
least. This is reflected in the revised standard FIPS 186-2 [11]. For the other subgroup 
systems the finite field size remains to be decided upon. It may be assumed 
that both for properly chosen finite fields and for ECC the resulting subgroup 
operation is slower than a single application of the AES or SHA. It follows that, 
with respect to the familiar exhaustive search, collision, and square-root attacks 
against AES-$\ell$, SHA-$2\ell$, and properly chosen subgroups, respectively, the weakest 
links will be the AES and the SHA, not the subgroup based system. 

It may be argued that the question addressed in this paper is of academic 
interest only. Indeed, it remains to be seen if the security obtained by actual 
realization and application of ‘unbelievably secure’ systems such as AES-192, 
AES-256, or matching public key systems, will live up to the intended theoretical 
bounds. That issue is beyond the scope of this article. Even under the far-fetched 
assumption that implementations are perfect, it is conceivable that the actual 
security achieved by the AES is less than the intended one. Thus, even though 
one may be happy with the (intended) security provided by AES-128, one may 
cautiously decide to use AES-256 and match it with a public key system of ‘only’ 
128-bit security [21]. Therefore, and to give the theme of this paper somewhat 
wider applicability, not only public key sizes matching AES-192 and AES-256 
are presented, but also the possibly more realistic sizes matching DES, 2K3DES, 
3K3DES, and AES-128. Here iK3DES refers to triple DES with i keys. 

This paper is organized as follows. Issues concerning security levels of the 
cryptosystems under consideration are discussed in Section 2. RSA moduli sizes 
of security equivalent to the symmetric systems, now and in the not too distant 
future, are presented in Section 3. The security of RSA-MP, i.e., the minimal fac- 
tor size of (matching) RSA moduli, is discussed in Section 4. Section 5 discusses 
matching finite field sizes for a variety of finite fields as applied in systems based 
on subgroups of multiplicative groups (i.e., not ECC): prime fields, extension 
fields with constant extension degree, and fields with constant (small) charac- 
teristic. Section 6 discusses various performance related issues, such as total key 
lengths and relative runtimes of cryptographic operations. A summary of the 
findings is presented in Section 7. 

2 Security Levels 

2.1 Breaking Cryptosystems 

Throughout this paper breaking a symmetric cryptosystem means retrieving the 
symmetric key. Breaking RSA means factoring the public modulus, and breaking 
a subgroup based public key system means computing the discrete logarithm of 
a public subgroup element with respect to a known generator. Attacks based 
on protocol specific properties or the size of public or secret exponents are not 
considered. Thus, this paper lives in an idealized world where only key search 
and number theoretic attacks count. For any real life situation this is a gross 
oversimplification. But real life security cannot be obtained without resistance 
against these basic attacks. 



2.2 Equivalence of Security 

Under the above attack model, two cryptosystems provide the same level of 
security if the expected effort to break either system is the same. This way 
of comparing security levels sounds simpler than it is, because ‘effort’ can be 
interpreted in several ways. In [7] two possible ways are distinguished to compare 
security levels: 

— Two cryptosystems are computationally equivalent if breaking them takes, 
on average, the same computational effort. 

— Two cryptosystems are cost equivalent if acquiring the hardware to break 
them in the same expected amount of time costs the same. 

Both types of equivalence have their pros and cons. The computational effort to 
break a cryptosystem can, under certain assumptions, be estimated fairly accu- 
rately. If the assumptions are acceptable, then the outcome should be acceptable 
as well. Computational effort does not take into account that it may be possible 
to attack one system using much simpler and cheaper hardware than required 
for the other. The notion of cost equivalence attempts to include this issue as 
well. But it is an inherently much less precise measure, because the cost of 
hardware cannot be pinpointed precisely. 



2.3 Symmetric Key Security Levels 

A symmetric cryptosystem provides d-bit security if breaking it requires on 
average $2^{d-1}$ applications of the cryptosystem. Throughout this paper the following 
assumptions are made: 

1. Single DES provides 56-bit security. 

2. 2K3DES provides 95-bit security. 

3. 3K3DES provides 112-bit security [15, page 360]. 

4. AES-$\ell$ provides $\ell$-bit security, for $\ell$ = 128, 192, 256. 
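
The four assumptions above can be collected in a small lookup table. The following is only an illustrative sketch: the names `ASSUMED_SECURITY_BITS` and `expected_trials` are hypothetical, and the code assumes that d-bit security means an expected 2^(d-1) applications of the cryptosystem for a key search.

```python
# Security levels (in bits) assumed in the list above.
ASSUMED_SECURITY_BITS = {
    "DES": 56,
    "2K3DES": 95,
    "3K3DES": 112,
    "AES-128": 128,
    "AES-192": 192,
    "AES-256": 256,
}


def expected_trials(system: str) -> int:
    """Expected number of cipher applications to break `system`.

    Assumes d-bit security corresponds to 2**(d - 1) applications on average.
    """
    d = ASSUMED_SECURITY_BITS[system]
    return 2 ** (d - 1)


if __name__ == "__main__":
    for name, bits in ASSUMED_SECURITY_BITS.items():
        print(f"{name}: about 2^{bits - 1} applications expected")
```

Under these assumptions, moving from 3K3DES to AES-128 multiplies the expected attack effort by 2^16.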

The single DES estimate is based on the effort spent by recent successful attacks 
on single DES, such as described in [5]. The 2K3DES estimate is based on the 
approximately 100-bit security estimate from [20] combined with the observation 
that since 1990 the price of memory has come down relative to the price of 
processors. It may thus be regarded as an estimate that is good only for cost 
equivalence purposes. However, the computationally equivalent estimate may 
not be much different. The commonly used 112-bit estimate for 3K3DES is of a 
computational nature and ignores memory costs that far exceed processor costs. 
The best realistic attack uses parallel collision search on a machine with about a 
million terabytes of memory, and would lead to a security level of 116 bits¹. This 
is more conservative than the classic meet-in-the-middle attack, which would lead 
to 128-bit (cost-equivalent) security. These comments on 2K3DES and 3K3DES 
security levels are due to Mike Wiener [21]. 

As far as the AES estimates are concerned, there is no a priori reason to 
exclude the possibility of substantial cryptanalytic progress affecting the security 
of the AES, in particular given how new the AES is. It is assumed, however, that 
if the AES estimates turn out to be wrong, then the AES will either be patched 
(cf. the replacement of SHA by SHA-1), or that it will be replaced by a new 
version of the proper and intended security levels. 

The security provided by a symmetric cryptosystem is not necessarily the 
same as its key length. The above assumptions hold only if all keys are full- 
length. Systems of intermediate strength can be obtained by fixing part of the 
keys. This possibility is not further discussed in this paper (but see Figure 1). 

It is assumed that symmetric keys are used for a limited amount of time and 
a limited encryption volume. Issues related to the limited block length of the 
DES and its variants are therefore of no concern in this paper. 



2.4 Public Key Security Levels 

Security levels of public key systems are determined by comparing them to sym- 
metric key security levels. This means that computational and cost equivalence 
have to be distinguished. 

In [7] it is argued that computational and cost equivalence are equivalent 
measures for the comparison of the security of symmetric systems and ECC. Not 
explicitly mentioned in [7], and therefore worth mentioning here, is the related 
fact that the amount of storage needed by the most efficient known attack on 
ECC (parallelized Pollard rho) does not depend on the subgroup order, but 
only on the relative cost of processors and storage [21]. In any case, if AES-128 
and a certain variant of ECC are computationally equivalent, then they may be 
considered to be cost equivalent as well. 

For the other public key systems, however, there is a gap between compu- 
tational and cost equivalence. For example, it follows from [7] that AES-128 
and about 3200-bit RSA are currently computationally equivalent. With respect 
to cost equivalence, AES-128 is currently more or less equivalent to 2650-bit 
RSA. This last estimate depends on an assumption about hardware prices and 
increases with cheaper hardware. See Section 3 for details. In Sections 3 to 5 
both types of equivalence are used to determine public key parameters that pro- 
vide security equivalent to the symmetric systems. The approach used is based 
on [7], but entirely geared towards the current application. The results from [7] 
have been criticized as being conservative [16] - prospective users of AES-192 
or AES-256 may be even more conservative as far as security related choices are 
concerned. The non-ECC entries of most tables consist of two numbers, referring 
to the cost and computationally equivalent figures, respectively. 

¹ Each 4-fold memory reduction doubles the runtime. 







3 RSA Modulus Sizes of Matching Security 

3.1 Current Equivalence 

Let 

$$L[n] = e^{1.923 (\log n)^{1/3} (\log \log n)^{2/3}}$$

be the approximate asymptotic growth rate of the expected time required for a 
factoring attack against an RSA modulus n using the fastest currently known 
factoring algorithm, the number field sieve (NFS). This runtime does not depend 
on the size of the factors of n. It depends only on the size of the number n being 
factored. 

As in [7] actual factoring runtimes are extrapolated to obtain runtime estimates 
for larger factoring problems. The basis for the extrapolation is the fact 
that the computational effort required to factor a 512-bit RSA modulus is about 
50 times smaller than required to break single DES. With the asymptotic runtime 
given above it follows that a k-bit RSA modulus currently offers security 
computationally equivalent to a symmetric cryptosystem of d-bit security and 
speed comparable to single DES if 

$$L[2^k] \approx 50 \cdot 2^{d-56} \cdot L[2^{512}].$$

Furthermore, according to the estimates given in [7], a k'-bit RSA modulus 
currently offers security cost equivalent to the same symmetric cryptosystem if 

$$L[2^{k'}] \approx \frac{50 \cdot 2^{d-56} \cdot L[2^{512}]}{26 \cdot P}.$$

In the latter formula P indicates the (wholesale) price of a stripped down PC 
of average performance and with reasonable memory. In [7] the default choice 
P = 100 is made. Any other price within a reasonable range of the default choice 
will have little effect on the sizes of the resulting RSA moduli. See [7, Section 
3.2.5] for a more detailed discussion of this issue. 

Unlike [7], the relative speed of the different symmetric cryptosystems under 
consideration is ignored. The differences observed - comparable implementations 
of 3DES may be three times slower than single DES, but the AES may be 
three times faster - are so small that they have hardly any effect on the sizes of 
the resulting RSA moduli. If desired the right hand sides of the formulas above 
may be multiplied by v if the symmetric system under consideration is per 
application v times slower than single DES (using comparable implementations). 
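
The two equivalence conditions of this subsection can be solved numerically for the modulus size. The sketch below is illustrative only. It assumes that L[n] has the usual NFS form with constant 1.923 and natural logarithms, and that the cost-equivalent right hand side is divided by 26 * P. The function names are hypothetical.

```python
import math

NFS_CONSTANT = 1.923  # assumed constant in the exponent of L[n]


def log_L(bits: float) -> float:
    """Natural log of L[2**bits] = exp(1.923 (ln n)^(1/3) (ln ln n)^(2/3))."""
    ln_n = bits * math.log(2)
    return NFS_CONSTANT * ln_n ** (1 / 3) * math.log(ln_n) ** (2 / 3)


def rsa_bits(d: int, cost_equivalent: bool = False, price: float = 100.0) -> float:
    """Modulus size k with L[2^k] = 50 * 2^(d-56) * L[2^512], optionally / (26 * P)."""
    target = math.log(50) + (d - 56) * math.log(2) + log_L(512)
    if cost_equivalent:
        target -= math.log(26 * price)
    lo, hi = 64.0, 1_000_000.0  # log_L is increasing on this bracket
    for _ in range(200):  # plain bisection on the modulus size
        mid = (lo + hi) / 2
        if log_L(mid) < target:
            lo = mid
        else:
            hi = mid
    return hi


if __name__ == "__main__":
    for d in (56, 95, 112, 128, 192, 256):
        cost = math.ceil(rsa_bits(d, cost_equivalent=True))
        comp = math.ceil(rsa_bits(d))
        print(f"d = {d:3d}: cost equivalent {cost}, computationally equivalent {comp}")
```

Under these assumptions the values for d = 128 come out close to the 2644 and 3224 bits quoted for AES-128 in Section 2.4.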



3.2 Expected Future Equivalence 

Improved hardware may be expected to have the same effect on the security 
of symmetric and asymmetric cryptosystems. It may therefore be assumed that 
over time the relative security of symmetric cryptosystems and RSA is affected 
only by new cryptanalytic insights that affect one system but not the other. 
As far as cryptanalytic progress against symmetric cryptosystems is concerned, 
it is assumed that they are patched or replaced if a major weakness is 
found, cf. 2.3. 

Progress in factoring, i.e., cryptanalytic progress against RSA, is common. 
The past effects of improved factoring methods closely follow a Moore-type 
law [7]. Extrapolation of this observed behavior implies the following. In year 
$y \geq 2001$ a k-bit RSA modulus may be expected to offer security computationally 
equivalent to a symmetric cryptosystem of d-bit security if 

$$L[2^k] \approx 50 \cdot 2^{d-56+2(y-2001)/3} \cdot L[2^{512}].$$

Cost equivalence is achieved in year y for a k'-bit RSA modulus if 

$$L[2^{k'}] \approx \frac{50 \cdot 2^{d-56+2(y-2001)/3} \cdot L[2^{512}]}{26 \cdot P},$$

with P as in 3.1. As in 3.1 effects of the symmetric cryptosystem speed are ignored, 
and P = 100 is a reasonable default choice. For y = 2001 the formulas are 
the same as in 3.1, even though, compared to [7], two years of factoring progress 
should have been taken into account. Such progress has not been reported in 
the literature. If progress had been obtained according to Moore’s law, its effect 
on RSA moduli sizes matching the AES would have been between one and two 
percent, which is negligible. 
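
As a small worked illustration (assuming the exponent adjustment for year y is 2(y - 2001)/3, as in the Moore-type extrapolation above), the expected factoring progress can be read as extra bits of symmetric security that the RSA modulus has to absorb:

```latex
\begin{align*}
y = 2010:&\quad \tfrac{2(2010-2001)}{3} = 6, \\
y = 2020:&\quad \tfrac{2(2020-2001)}{3} = \tfrac{38}{3} \approx 12.67, \\
y = 2030:&\quad \tfrac{2(2030-2001)}{3} = \tfrac{58}{3} \approx 19.33.
\end{align*}
```

Read this way, an RSA modulus that is to match d-bit security in 2010 is chosen as if it had to match (d + 6)-bit security in 2001.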

3.3 Resulting RSA Modulus Sizes 

The formulas from 3.1 and 3.2 with P = 100 lead to the RSA modulus sizes 
in Table 1. The first (lower) number corresponds to the bit-length of a cost 
equivalent RSA modulus, the second (higher) number is the more conservative 
bit length of a computationally equivalent RSA modulus. Currently equivalent 
sizes are given in the row for year 2001, and sizes that can be expected to be 
equivalent in the years 2010, 2020, and 2030, are given in the rows for those 
years. It is assumed that factoring progress until 2030 behaves as it behaved 
since about 1970, i.e., that it follows a Moore-type law. If new factoring progress 
is found to be unlikely, the numbers given in the row for year 2001 should be 
used for all other years instead. If factoring progress is expected, but at a slower 
rate than in the past, one may for instance use the 2010 data for 2020. The 
data as presented in the table, however, and in particular the computationally 
equivalent sizes, may be interpreted as ‘conservative’. It should be understood 
that, even for the conservative choices, there is no guarantee that surprises will 
not occur. 

The numbers in Table 1 are not rounded or manipulated in any other way. 
That is left to the user, cf. [7, Remark 4.1.1]. For the 416-bit RSA modulus cost 
equivalent in 2001 to single DES, see also Table 2. As an example suppose an 
RSA modulus size has to be determined for an application that uses AES-192 and 
that is supposed to be in operation until 2020. It follows from Table 1 that RSA 
moduli of eight to nine thousand bits should be used. Using RSA moduli of 
only three to four thousand bits would undermine the apparently desired 
Table 1. Matching RSA modulus sizes. 

Year   DES        2K3DES      3K3DES      AES-128     AES-192     AES-256
2001   416  620   1333 1723   1941 2426   2644 3224   6897 7918   13840 15387
2010   518  747   1532 1955   2189 2709   2942 3560   7426 8493   14645 16246
2020   647  906   1773 2233   2487 3046   3296 3956   8042 9160   15574 17235
2030   793 1084   2035 2534   2807 3408   3675 4379   8689 9860   16538 18260



security level (namely, higher than AES-128). Five to seven thousand bit RSA 
moduli would make the public system stronger than AES-128, as desired, but 
would also make RSA the weakest link if AES-192 lives up to the expectations. 

4 RSA Factor Sizes of Matching Security 

Let 

$$E[n,p] = (\log_2 n)^2 e^{\sqrt{2 \log p \log \log p}}$$

be the approximate asymptotic growth rate of the expected time required by 
the elliptic curve method (ECM) to find a factor p of a composite number n 
(assuming that such a factor exists). This runtime depends mostly on the size of 
the factor p, and only polynomially on the size of the number n being factored. 
It follows that smaller factors can be found faster. A regular RSA modulus n has 
two prime factors of about $(\log_2 n)/2$ bits. In that case the ECM can in general 
be expected to be slower than the NFS, so the ECM runtime does not have to 
be taken into account in Section 3. In RSA-MP the RSA modulus has more than 
two prime factors. This implies that the factors should be chosen in such a way 
that they cannot be found faster using the ECM than using the NFS. In this 
section it is analysed how many factors an RSA-MP modulus may have so that 
the overall security is not affected. It is assumed that the modulus size is chosen 
according to Table 1, so that the moduli offer security equivalent to the selected 
symmetric cryptosystem with respect to NFS attacks. It is also assumed that all 
factors have approximately the same size. 

From the definitions of L[n] and E[n,p] it follows that, roughly, the factors 
p of an RSA-MP modulus n should grow proportionally to 

$$e^{(\log n)^{2/3}}.$$

The size $\log_2 p$ should therefore grow as $(\log_2 n)^{2/3}$, and an RSA-MP modulus n 
may, asymptotically, have approximately $O((\log_2 n)^{1/3})$ factors. Such asymptotic 
results are, however, of hardly any interest for this paper. 
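
A rough sketch of where these growth rates come from, under stated assumptions: L[n] is taken to have the NFS form with constant 1.923 in the exponent, and the polynomial factor $(\log_2 n)^2$ in E[n,p] is ignored. Balancing the ECM exponent against the NFS exponent gives:

```latex
\begin{align*}
\sqrt{2 \log p \, \log\log p} &\approx 1.923\,(\log n)^{1/3} (\log\log n)^{2/3}
  && \text{(ECM exponent matches NFS exponent)} \\
\log p &\approx \frac{1.923^2\,(\log n)^{2/3} (\log\log n)^{4/3}}{2 \log\log p}
  && \text{(square both sides)} \\
\frac{\log n}{\log p} &\approx \frac{2 \log\log p}{1.923^2\,(\log\log n)^{4/3}}\,(\log n)^{1/3}
  && \text{(number of equal-size factors)}
\end{align*}
```

Up to the slowly varying doubly logarithmic terms, $\log p$ therefore grows like $(\log n)^{2/3}$ and the number of factors like $(\log n)^{1/3}$.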

Instead, given an RSA modulus (chosen according to Table 1) an explicit 
bound is needed for the number of factors that may be allowed. To derive such a 
bound the approach from [7] cited in 2.4 is used: actual runtimes are extrapolated 
to derive expected runtimes for larger problem instances. The basis for the 
extrapolation is the observation that finding a 167-bit factor of a 768-bit number 
can be expected to require an about 80 times smaller computational effort than 
breaking single DES ([7, Section 5.9] and [22]). Let n' be an RSA modulus that 
offers security (computationally or cost) equivalent to a symmetric cryptosystem 
of d-bit security and speed comparable to single DES (i.e., n' is chosen according 
to Table 1). An RSA-MP modulus n with smallest prime factor p and with 
$\log n \approx \log n'$ offers security equivalent to the same symmetric cryptosystem if 

$$E[n,p] \geq 80 \cdot 2^{d-56} \cdot E[2^{768}, 2^{167}].$$

Here it is assumed that it is reasonable not to expect substantial improvements 
of the ECM, and that for application of the ECM itself computational and cost 
equivalence are the same [16]. Given the least p satisfying the above formula, 
the recommended number of factors of an RSA-MP modulus n equals $m = 
\lfloor \log n / \log p \rfloor$. The resulting numbers of factors are given in Table 2, along with 
the bit lengths $\lceil (\log_2 n)/m \rceil$ of the factors, with the computationally equivalent 
result below the cost equivalent one. Note that $\log_2 p \leq \lceil (\log_2 n)/m \rceil$. For single 
DES and a cost equivalent RSA modulus in 2001 this approach would lead to a 
single 416-bit factor, since factoring a composite 416-bit RSA modulus using the 
ECM can be expected to be easier than breaking single DES. For that reason, 
that entry is replaced by ‘two 217-bit factors’. 
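
The procedure just described can be sketched in code. This is an illustration under assumptions, not the computation actually used for Table 2: it assumes E[n,p] = (log2 n)^2 * exp(sqrt(2 ln p ln ln p)) with natural logarithms in the exponent, and the condition E[n,p] >= 80 * 2^(d-56) * E[2^768, 2^167] suggested by the 167-bit/768-bit data point. The function names are hypothetical.

```python
import math


def log_E(n_bits: float, p_bits: float) -> float:
    """Natural log of E[n, p] for n ~ 2**n_bits and p ~ 2**p_bits."""
    ln_p = p_bits * math.log(2)
    # (log2 n)^2 contributes 2 * ln(n_bits); the ECM term is sqrt(2 ln p ln ln p).
    return 2 * math.log(n_bits) + math.sqrt(2 * ln_p * math.log(ln_p))


def rsa_mp_factors(n_bits: int, d: int):
    """Return (m, factor_bits) for an n_bits modulus matching d-bit security."""
    target = math.log(80) + (d - 56) * math.log(2) + log_E(768, 167)
    lo, hi = 8.0, 1_000_000.0  # bracket for the least admissible size of p in bits
    for _ in range(200):  # bisection: log_E is increasing in p_bits
        mid = (lo + hi) / 2
        if log_E(n_bits, mid) < target:
            lo = mid
        else:
            hi = mid
    m = max(1, math.floor(n_bits / hi))  # number of factors of (at least) that size
    return m, math.ceil(n_bits / m)


if __name__ == "__main__":
    # Computationally equivalent 2001 moduli for AES-192 and AES-256.
    print(rsa_mp_factors(7918, 192))
    print(rsa_mp_factors(15387, 256))
```

For the 416-bit modulus that is cost equivalent to single DES, this sketch returns a single factor, which is the degenerate case discussed above.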



Table 2. Number of factors and factor size for matching RSA-MP moduli. 

Year   DES       2K3DES    3K3DES    AES-128    AES-192    AES-256
2001   2 : 217   2 : 667   2 : 971   3 : 882    4 : 1725   4 : 3460
       2 : 310   3 : 575   3 : 809   3 : 1075   4 : 1980   5 : 3078
2010   2 : 259   3 : 511   3 : 730   3 : 981    4 : 1857   5 : 2929
       3 : 249   4 : 489   4 : 678   4 : 890    5 : 1699   5 : 3250
2020   3 : 216   3 : 591   3 : 829   4 : 824    4 : 2011   5 : 3115
       4 : 227   4 : 559   4 : 762   4 : 989    5 : 1832   6 : 2873
2030   3 : 265   4 : 509   4 : 702   4 : 919    5 : 1738   5 : 3308
       5 : 217   5 : 507   5 : 682   5 : 876    5 : 1972   6 : 3044


It can be seen that for a fixed symmetric cryptosystem the number of factors 
allowed in RSA-MP increases over time. This is mostly due to the fact that the 
growing moduli sizes ‘allow’ more primes of the same size, and to a much smaller 
degree due to the fact that larger moduli make application of the ECM slower. 

Almost the same numbers as in Table 2 are obtained if the factor 80 is 
replaced by any other number in the range [80/5, 80 * 5]. Uncertainty about the 
precise expected behavior of the ECM is therefore not important, as long as the 
estimate is in an acceptable range. 

It may be argued that E[n,p] should include a factor logp. It would make 
finding larger factors harder compared to the definition used above, and thus 
would lead to more factors per RSA-MP modulus. For Table 2 it hardly matters. 
Similarly, the factor $(\log_2 n)^2$ in E[n,p] may be replaced by $(\log_2 n)^{\log_2 3}$ (or 
something even smaller) if faster multiplication techniques such as Karatsuba 
(or an even faster method) are used. The effect of these changes on Table 2 is 
small: for computational equivalence to 2K3DES in 2010 and for cost equivalence 
to AES-128 in 2020 it would result in three instead of four factors. 
4.1 Remark 

Although strictly speaking beyond the scope of this paper, Table 3 gives the number 
of factors that may be allowed in RSA-MP moduli of bit lengths 1024, 2048, 
4096, and 8192 with the cost equivalent number followed by the computationally 
equivalent one. It follows, for example, that in the conservative computationally 
equivalent model one would currently allow three factors in a 1024-bit RSA-MP 
modulus. But, using less conservative cost equivalence one would, more conser- 
vatively, allow only two factors in a 1024-bit modulus (see also Figure 1). This is 
consistent with the fact that for cost equivalence 1024-bit moduli are considered 
to be more secure than for computational equivalence: currently just 74 bits for 
the latter but 85 bits for the former. 
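
The two security figures for a 1024-bit modulus can be checked by solving the equivalence conditions of Section 3 for d instead of for the modulus size. The sketch below is illustrative and rests on assumptions: L[n] is taken as exp(1.923 (ln n)^(1/3) (ln ln n)^(2/3)), and the cost-equivalent condition divides the right hand side by 26 * P with P = 100. The function names are hypothetical.

```python
import math


def log_L(bits: float) -> float:
    """Natural log of L[2**bits], assuming the NFS constant 1.923."""
    ln_n = bits * math.log(2)
    return 1.923 * ln_n ** (1 / 3) * math.log(ln_n) ** (2 / 3)


def security_bits(modulus_bits: int, cost_equivalent: bool = False,
                  price: float = 100.0) -> float:
    """Solve L[2^k] = 50 * 2^(d-56) * L[2^512] (optionally / (26 * P)) for d."""
    log_ratio = log_L(modulus_bits) - log_L(512) - math.log(50)
    if cost_equivalent:
        log_ratio += math.log(26 * price)  # cheaper attack hardware counts as extra bits
    return 56 + log_ratio / math.log(2)


if __name__ == "__main__":
    print(f"computational: {security_bits(1024):.1f} bits")
    print(f"cost:          {security_bits(1024, cost_equivalent=True):.1f} bits")
```

Under these assumptions both values land within one bit of the 74 and 85 bits quoted above.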

Table 3. Number of factors for RSA-MP popular modulus sizes. 

Year   1024   2048   4096   8192
2001   2 3    3 3    3 4    4 4
2010   2 3    3 4    3 4    4 5
2020   3 4    3 4    4 4    4 5
2030   3 5    4 5    4 5    5 5



5 Finite Field Sizes of Matching Security 

In this section subgroups refer to prime order subgroups of multiplicative groups 
of finite fields. Public key systems based on the use of subgroups can either be 
broken by directly attacking the subgroup or by attacking the finite field. 

As mentioned in Section 1 the subgroup size will in practice be determined by 
the hash size. The latter follows immediately from the symmetric cryptosystem 
choice if the AES is used. Because the subgroup order is prime, the subgroup 
offers security equivalent to the symmetric cryptosystem as far as direct subgroup 
attacks are concerned. It remains to select the finite field in such a way that it 
provides equivalent security as well. That is the subject of this section. 

5.1 Fixed Degree Extension Fields 

Let p be a prime number and let k > 0 be a fixed small integer. The approximate 
asymptotic growth rate of the expected time to compute discrete logarithms in 
$\mathbf{F}_{p^k}^*$ is $L[p^k]$, where L is as in 3.1. An RSA modulus n and a finite field $\mathbf{F}_{p^k}$ 
therefore offer about the same level of security if n and $p^k$ are of the same order 
of magnitude (disregarding the possibility of subgroup attacks in $\mathbf{F}_{p^k}^*$). It is 
generally accepted that for such n, p, and k factoring n is somewhat easier than 
computing discrete logarithms in $\mathbf{F}_{p^k}^*$. For the present purposes the distinction is 
negligible. Furthermore, it is reasonable to assume the same rate of cryptanalytic 
progress for factoring and computing discrete logarithms. It follows that Table 1 
can be used to obtain matching fixed degree extension field sizes: to find $\log_2 p$ 
divide the numbers given in Table 1 by the fixed extension degree k. 
5.2 Prime Fields 

It follows from 5.1 that if prime fields are used (i.e., k = 1), then conservative 
field sizes (i.e., $\lceil \log_2 p \rceil$) are given by the numbers in Table 1. 

As an example suppose a subgroup and prime field size have to be determined 
for an application that uses AES-256 and that is supposed to be in operation 
until 2010. Since SHA-512 will be used in combination with AES-256, the most 
practical subgroup order is a 512-bit prime. Furthermore, it follows from Table 1 
that the prime determining the prime field should be about fifteen thousand bits 
long. Using eight thousand bits or less would undermine the apparently desired 
security level (namely, higher than AES-192). A nine to fourteen thousand bit 
prime would make the public system stronger than AES-192, as desired, but 
would also make the prime field discrete logarithm the weakest link. 

5.3 Extension Fields of Degrees 2 and 6 

LUC and XTR reduce the representation size of subgroup elements by using 
their trace over a certain subfield so that the representation belongs to the 
subfield as well. This does not affect the security and increases the computational 
efficiency [8,17]. 

LUC. LUC uses a subgroup of $\mathbf{F}_{p^2}^*$ of order dividing p + 1 and traces over $\mathbf{F}_p$. It 
follows from 5.1 that the size of the prime field $\mathbf{F}_p$ can be found by dividing the 
numbers from Table 1 by k = 2. Table 4 contains the resulting values of $\lceil \log_2 p \rceil$. 

XTR. XTR uses a subgroup of $\mathbf{F}_{p^6}^*$ of order dividing $p^2 - p + 1$ and traces 
over $\mathbf{F}_{p^2}$. The size of the underlying prime field $\mathbf{F}_p$ can be found by dividing the 
numbers from Table 1 by k = 6, resulting in the $\lceil \log_2 p \rceil$-values in Table 5. 
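
The division by the extension degree is simple enough to state as code. The sketch below is illustrative; the function name is hypothetical, and rounding up is an assumption that agrees with the LUC and XTR tables.

```python
def prime_field_bits(rsa_modulus_bits: int, extension_degree: int) -> int:
    """Bit length of p when p**k should be about as large as the RSA modulus.

    Rounds up, using integer arithmetic to avoid floating point error.
    """
    return -(-rsa_modulus_bits // extension_degree)


if __name__ == "__main__":
    # AES-128 in 2001: cost and computationally equivalent RSA sizes from Table 1.
    for modulus_bits in (2644, 3224):
        print("LUC (k = 2):", prime_field_bits(modulus_bits, 2))
        print("XTR (k = 6):", prime_field_bits(modulus_bits, 6))
```

For prime fields (k = 1) the function simply returns the RSA modulus size, as stated in 5.2.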

5.4 Remark 

For many of the LUC and XTR key sizes in Tables 4 and 5 there is an integer 
e > 1 such that $(\log_2 p)/e > \log_2 q$. This implies that the fields $\mathbf{F}_p$ in LUC and 
$\mathbf{F}_{p^2}$ in XTR can be replaced by $\mathbf{F}_{\bar{p}^e}$ (LUC) and $\mathbf{F}_{\bar{p}^{2e}}$ (XTR), where $\log_2 \bar{p}^e \approx 
\log_2 p$ (see [8, Section 6]). Because as a result $\log_2 \bar{p} > \log_2 q$, proper $\bar{p}$ and q can 
still be found efficiently, in ways similar to the ones suggested in [8]. In XTR 
care must be taken that q and $\bar{p}$ are chosen so that q is a prime divisor of $\phi_{6e}(\bar{p})$: 
the 6e-th cyclotomic polynomial evaluated at $\bar{p}$, which divides $\bar{p}^{2e} - \bar{p}^e + 1$. In 
LUC q must divide $\phi_{2e}(\bar{p})$, a divisor of $\bar{p}^e + 1$. With a proper choice of minimal 
polynomial for the representation of the elements of $\mathbf{F}_{\bar{p}^e}$ (LUC) or $\mathbf{F}_{\bar{p}^{2e}}$ (XTR), 
this leads to smaller public keys and potentially a substantial speedup (also of 
the parameter selection). The numbers in Section 6 do not take this possibility 
into account. 

5.5 Small Characteristic Fields 

Let p be a small fixed prime (such as 2), and let k > 0 be an extension degree. The 
approximate asymptotic growth rate of the time to compute discrete logarithms 
in $\mathbf{F}_{p^k}^*$ for small fixed p is 

$$e^{c (\log p^k)^{1/3} (\log \log p^k)^{2/3}}$$
Table 4. $\lceil \log_2 p \rceil$ for matching LUC prime fields. 

Year   DES       2K3DES      3K3DES      AES-128     AES-192     AES-256
2001   208 310   667  862     971 1213   1322 1612   3449 3959   6920 7694
2010   259 374   766  978    1095 1355   1471 1780   3713 4247   7323 8123
2020   324 453   887 1117    1244 1523   1648 1978   4021 4580   7787 8618
2030   397 542   1018 1267   1404 1704   1838 2190   4345 4930   8269 9130



Table 5. $\lceil \log_2 p \rceil$ for matching XTR prime fields. 

Year   DES       2K3DES    3K3DES    AES-128   AES-192     AES-256
2001    70 104   223 288   324 405   441 538   1150 1320   2307 2565
2010    87 125   256 326   365 452   491 594   1238 1416   2441 2708
2020   108 151   296 373   415 508   550 660   1341 1527   2596 2873
2030   133 181   340 423   468 568   613 730   1449 1644   2757 3044



Table 6. $\lceil \log_2 p^k \rceil$ for matching small characteristic fields. 

Year   DES        2K3DES      3K3DES      AES-128     AES-192       AES-256
2001   455  732   1767 2357   2690 3440   3781 4695   10637 12318   22210 24823
2010   592  912   2066 2711   3073 3883   4249 5227   11508 13269   23570 26277
2020   770 1140   2432 3140   3535 4414   4809 5861   12524 14377   25139 27954
2030   977 1398   2835 3608   4037 4986   5412 6539   13594 15539   26771 29694



for c oscillating in the interval [1.526, 1.588] (cf. [3]). Since the smallest c leads 
to the more conservative field sizes, let 

This function is similar to L as defined in 3, but has a smaller constant in 
the exponent. This has serious implications for the choice of the field size 
for small fixed p, compared to the case where k is fixed (as in 5.1). Computing 
discrete logarithms in F2607 requires an about 25 times smaller computational 
effort than breaking single DES [19]. It follows that a small fixed characteristic 
field Fpfc currently offers security computationally equivalent to a symmetric 
cryptosystem of d-bit security and speed comparable to single DES if 

T'[/] « 25*2'^-^® [2®°^]. 

With respect to cost equivalence and expected future equivalence the same ap- 
proach as in 3 and 3.1 is used: divide the right hand side by 26 * P for cost 
equivalence, and multiply it by for future equivalence. The resulting
values of [log2 p^k] for small characteristic fields are given in Table 6 (for P = 100);
for p = 2 the numbers indicate the recommended value for k. Historically, sub- 
groups of multiplicative groups of characteristic two finite fields were mostly of 
interest because of their computational advantages. Comparing the numbers in 
Table 1 and Table 6, however, it is questionable if the computational advantages 
outweigh the disadvantage of the relatively large field size. 
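
The computational equivalence condition above can be explored numerically. The following is a minimal sketch, assuming the function L'[n] = exp(1.526 (ln n)^{1/3} (ln ln n)^{2/3}) and the condition L'[p^k] ≈ 25 · 2^{d-56} · L'[2^607] for p = 2. The names `log_L_prime` and `matching_field_bits` are hypothetical, and the sketch ignores the cost and future equivalence adjustments, so its output should only land in the neighbourhood of the Table 6 entries rather than reproduce them.

```python
import math

C_SMALL_CHAR = 1.526  # smallest constant of the interval [1.526, 1.588]


def log_L_prime(bits: float) -> float:
    """Natural logarithm of L'[2^bits] (assumed form of L', see lead-in)."""
    ln_n = bits * math.log(2)
    return C_SMALL_CHAR * ln_n ** (1 / 3) * math.log(ln_n) ** (2 / 3)


def matching_field_bits(d: int) -> int:
    """Smallest k with L'[2^k] >= 25 * 2^(d-56) * L'[2^607].

    Only computational equivalence is modelled; no cost or future factors.
    """
    target = math.log(25) + (d - 56) * math.log(2) + log_L_prime(607)
    lo, hi = 607, 1 << 20
    while lo < hi:  # bisection works because log_L_prime is increasing here
        mid = (lo + hi) // 2
        if log_L_prime(mid) >= target:
            hi = mid
        else:
            lo = mid + 1
    return lo


if __name__ == "__main__":
    for d in (56, 112, 128, 192, 256):
        print(d, matching_field_bits(d))
```

Working with logarithms avoids overflow for large d; the bisection bounds are an arbitrary choice that comfortably brackets the field sizes discussed here.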

Fig. 1. The sizes from Tables 1 and 6 and the numbers of factors from Table 2 for the
year 2010. The shaded areas are bounded from above by the computationally equivalent 
curves and step function and from below by the cost equivalent ones. 



6 Performance Issues 

Assume that public key sizes are chosen according to Tables 1 to 6 to match a 
symmetric cryptosystem of d-bit security. In this section the impact on public 
key size overhead and computational requirements is discussed. 



6.1 Public Key Sizes 

In Table 7 public key sizes are given for three scenarios. The regular public key 
refers to all bits contained in the public key. In an ID-based set-up the public 
key is reconstructed based on the user’s identity and an additional number of 
overhead bits. Refer to [6] for ID-based public key compression for RSA. For 
subgroup based systems ID-based methods can trivially be designed in almost 
any number of ways. In a shared public key environment users share a large part 
of the public key data. In that case only the part that is unique for each user 
has to be counted. 

For subgroup based systems the public key consists of a description of the
subgroup, the generator g, its prime order q, and the public point h = g^s (or
its trace), where s is the secret key. The generator itself can usually be
derived at the cost of an exponentiation of an element with a small
representation, and is not counted. In an ID-based system the description of
the subgroup and q can be reconstructed from the user's identity and, say, 64
additional bits, which leads to a total public key overhead of 64 bits plus the
bits required to describe h. In a shared environment all users use the same g
and q, so h is the only part of the public key that is unique for each user.

Fixed degree extension fields are not considered in Table 7, because in that
case one may as well use LUC or XTR. The choice of subgroups of multiplicative
groups of small characteristic fields is limited. Using such subgroups therefore
makes sense only in a context where the public key data, with the exception of
the public point h, are shared.



Table 7. Number of bits required for public key data.

PKC                             regular            ID-based                          shared

RSA, public exponent e,         log2 e + log2 n    log2((1/2) log2 n) + (1/2) log2 n n/a
  log2 n from Table 1,
  2 log2 p = log2 n
RSA-MP, public exponent e,      log2 e + log2 n    log2 log2 p + ((m-1)/m) log2 n    n/a
  log2 n from Table 1,
  m, log2 p from Table 2
Fp, log2 p from Table 1         2 log2 p           64 + log2 p                       log2 p
F_{p^k}, small p,               n/a                n/a                               log2 p^k
  log2 p^k from Table 6
LUC, log2 p from Table 4        2d + 2 log2 p      64 + log2 p                       log2 p
XTR, log2 p from Table 5        2d + 3 log2 p      64 + 2 log2 p                     2 log2 p
ECC, log2 p = 2d                9d + 1             3d + 65                           2d + 1



For LUC and XTR the public key sizes follow from [17] and [8]. For ECC
the description of the subgroup requires a finite field and an elliptic curve over
the field. With d as above, the field and curve take at most 2d and 4d bits,
respectively. About d and 2d + 1 bits are required for the subgroup order q and
the public point h. This leads to 9d + 1 bits for ordinary ECC, 3d + 65 bits for
ID-based ECC (since the information about q must be present), and 2d + 1 for
shared ECC. For ECC the sizes do not depend on the year. To illustrate the
public key size formulas, public key sizes for the year 2010 are given in Table 8,
rounded to two significant digits.
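
As a quick check of the size formulas, the sketch below evaluates the ECC and XTR expressions (9d + 1, 3d + 65, 2d + 1 and 2d + 3 log2 p, 64 + 2 log2 p, 2 log2 p) and rounds them to two significant digits. The function names are hypothetical, and rounding half up is an assumption about how the published figures were rounded.

```python
from math import floor, log10


def round_sig(x: float, digits: int = 2) -> int:
    """Round a positive number to `digits` significant digits, half up."""
    scale = 10 ** (floor(log10(x)) - digits + 1)
    return int(x / scale + 0.5) * scale


def ecc_key_bits(d: int) -> dict:
    """ECC public key data in bits for d-bit symmetric security."""
    return {"regular": 9 * d + 1, "id_based": 3 * d + 65, "shared": 2 * d + 1}


def xtr_key_bits(d: int, log2_p: int) -> dict:
    """XTR public key data in bits; log2_p is taken from Table 5."""
    return {
        "regular": 2 * d + 3 * log2_p,
        "id_based": 64 + 2 * log2_p,
        "shared": 2 * log2_p,
    }


if __name__ == "__main__":
    # DES level (d = 56), year 2010, cost equivalent XTR prime of 87 bits.
    print({k: round_sig(v) for k, v in ecc_key_bits(56).items()})
    print({k: round_sig(v) for k, v in xtr_key_bits(56, 87).items()})
```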



6.2 Communication Overhead for Subgroup Based Systems 

Each message in the Diffie-Hellman key agreement protocol consists of the rep- 
resentation of a subgroup element. The communication overhead per message is 
given in the last column of Table 7. ElGamal encryption has the same overhead 
(on top of the length of the message itself). The communication overhead of 
ElGamal-based message recovery signature schemes is equal to 2d. 

Table 8. Number of bits of public key data.

PKC               DES          2K3DES       3K3DES       AES-128      AES-192       AES-256

regular
RSA(-MP)          550 780      1600 2000    2200 2700    3000 3600    7500 8500     15000 16000
Fp                1000 1500    3100 3900    4400 5400    5900 7100    15000 17000   29000 32000
LUC               630 860      1700 2100    2400 2900    3200 3800    7800 8900     15000 17000
XTR               370 490      960 1200     1300 1600    1700 2000    4100 4600     7800 8600
ECC               510          860          1000         1200         1700          2300

ID-based
RSA               270 380      770 990      1100 1400    1500 1800    3700 4300     7300 8100
RSA-MP            270 500      1000 1500    1500 2000    2000 2700    5600 6800     12000 13000
Fp                580 810      1600 2000    2300 2800    3000 3600    7500 8600     15000 16000
LUC               320 440      830 1000     1200 1400    1500 1800    3800 4300     7400 8200
XTR               240 310      580 720      790 970      1000 1300    2500 2900     4900 5500
ECC               230          350          400          450          640           830

shared
Fp                520 750      1500 2000    2200 2700    2900 3600    7400 8500     15000 16000
F_{p^k}, small p  590 910      2100 2700    3100 3900    4200 5200    12000 13000   24000 26000
LUC               260 370      770 980      1100 1400    1500 1800    3700 4200     7300 8100
XTR               170 250      510 650      730 900      980 1200     2500 2800     4900 5400
ECC               110          190          230          260          390           510



6.3 Computational Requirements 

In this section the relative theoretical computational requirements are estimated 
for the most common cryptographic applications of the public key cryptosystems 
discussed above: encryption, decryption, signature generation, and signature ver- 
ification. No actual runtimes are given. For software implementations the theo- 
retical estimates should give a reasonable prediction of the actual relative perfor- 
mance. For implementations using dedicated hardware, such as special-purpose 
exponentiators, all predictions concerning RSA and prime field subgroups are 
most likely too pessimistic. However, as soon as special-purpose hardware is 
available for ECC, LUC, or XTR, the relative performance numbers should again 
be closer to reality. 

For subgroup based systems common ElGamal-like schemes are used where 
decryption and signing each require a single subgroup exponentiation, encryp- 
tion requires two separate subgroup exponentiations, and signature verification 
requires the product of two subgroup exponentiations (a ‘double exponentia- 
tion’). The Diffie-Hellman key agreement protocol has, per party, the same cost 
as encryption, i.e., two separate subgroup exponentiations. 

It is assumed that squaring and multiplication in the finite field Fp and the
ring Z/nZ of integers modulo n take the same amount of time if log2 p ≈ log2 n.
A squaring in Z/nZ is assumed to take 80% of the time of a multiplication
in Z/nZ. Basic exponentiation methods are used, i.e., no window tricks. This
hardly affects the relative performance. Precomputation of the value g^t with
log2 t ≈ (log2 q)/2 combined with double exponentiation is used for subgroup
based signature generation. For XTR the methods from [18] are used. The LUC
and ECC estimates follow from [18, Section 7]. For ECC the time to recover the
y-coordinates of subgroup elements is not counted.
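
The assumptions above can be turned into a rough operation count for prime field subgroups. The sketch below is illustrative only: it assumes a subgroup order q of about 2d bits, the plain binary method (a multiplication for about half of the exponent bits), and simultaneous double exponentiation (a multiplication for about three quarters of the bits). Only the 80% squaring cost is taken from the text; the other densities and all function names are assumptions.

```python
SQUARING = 0.8  # cost of a squaring relative to a multiplication (from the text)


def single_exp(bits: int) -> float:
    """Binary method: one squaring per bit, a multiplication for ~1/2 of the bits."""
    return bits * SQUARING + bits * 0.5


def double_exp(bits: int) -> float:
    """Simultaneous g^a * h^b: one squaring per bit, a multiplication for ~3/4 of the bits."""
    return bits * SQUARING + bits * 0.75


def prime_field_costs(d: int) -> dict:
    """Multiplications in Fp for ElGamal-like schemes, assuming log2 q = 2d."""
    q_bits = 2 * d
    return {
        "encryption": 2 * single_exp(q_bits),          # two separate exponentiations
        "decryption": single_exp(q_bits),              # one exponentiation
        "signature_generation": double_exp(q_bits // 2),  # uses precomputed g^t
        "signature_verification": double_exp(q_bits),  # one double exponentiation
    }


if __name__ == "__main__":
    d = 100
    for name, cost in prime_field_costs(d).items():
        print(f"{name}: {cost / d:.2f} d")
```

Under these assumptions the counts come out as 5.2d, 2.6d, about 1.6d, and 3.1d multiplications for encryption, decryption, signature generation, and signature verification.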

The resulting runtime expressions for the four basic cryptographic functions 
are given in Table 9. Small characteristic fields are not included because the 



Table 9. Number of multiplications in Fp (unless noted otherwise).

PKC matching symmetric       encryption     signature      decryption        signature
system of d-bit security                    verification                     generation

RSA, public exponent e,      1.3 log2 e     1.3 log2 e     sequential:       sequential:
  log2 n from Table 1,       in Z/nZ        in Z/nZ          2.6 log2 p        2.6 log2 p
  2 log2 p = log2 n                                        2 in parallel:    2 in parallel:
                                                             1.3 log2 p        1.3 log2 p

RSA-MP, public exponent e,   1.3 log2 e     1.3 log2 e     sequential:       sequential:
  log2 n from Table 1,       in Z/nZ        in Z/nZ          1.3 m log2 p      1.3 m log2 p
  m, log2 p from Table 2                                   m in parallel:    m in parallel:
                                                             1.3 log2 p        1.3 log2 p

Fp, log2 p from Table 1      5.2d           3.1d           2.6d              1.6d
LUC, log2 p from Table 4     6.4d           3.5d           3.2d              1.8d
XTR, log2 p from Table 5     21d            12d            10d               6d
ECC, log2 p = 2d             36d            20d            18d               10d


relative speed of small characteristic field arithmetic and Fp arithmetic is too
platform dependent. Despite potential advantages of hardware F_{2^k}-arithmetic,
the large value that is required for k may make these fields unattractive for
very high security non-ECC cryptographic applications.

As an illustration of the data in Table 9, the relative performance of the
cryptographic operations is given in Table 10 for the year 2010, rounded to two
significant digits. For Table 10 the time M(L) for modular multiplication of
L-bit integers is proportional to L^2. This corresponds to regular hardware
implementations. The unit of time is the time required for a single multiplication
in Z/nZ for a 1024-bit integer n. This arbitrary choice has no influence on the
relative performance. For RSA and RSA-MP the sequential ('S') and parallel
('P') performance is given, with the number of parallel processors and the
relative parallel runtime separated by a semicolon. RSA encryption and signature
verification for e = 3 or e = 2^16 + 1 goes about 20 or 3 times faster, respectively,
than for a random 32-bit public exponent as in Table 10.

For higher security public key systems other than ECC the finite field and
ring sizes get so large that implementation using Karatsuba-like multiplication
techniques should be worthwhile. In software implementations this can easily be
realized. In Table 11 the relative performance for the year 2010 is given using
Karatsuba-like modular multiplication. This implies that M(L) is proportional
to L^{log2 3}, as opposed to L^2 as in Table 10. The unit of time in Table 11 is the
time required for a single Karatsuba-like multiplication in Z/nZ for a 1024-bit
integer n. Since this may be different from the time required for a regular 1024-
bit modular multiplication (as in Table 10), the numbers in Tables 10 and 11
are not comparable.
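
To see how the relative figures arise, the following sketch scales a multiplication count by (L/1024)^2 for regular arithmetic or (L/1024)^{log2 3} for Karatsuba-like arithmetic. The helper names are hypothetical, and the inputs (36d field multiplications for ECC encryption, 1.3 log2 e modular multiplications for RSA encryption, a 7400-bit modulus) are the rounded values quoted in the tables, so the results are approximate.

```python
import math


def relative_time(num_mults: float, bits: int, karatsuba: bool = False,
                  unit_bits: int = 1024) -> float:
    """Time in units of one `unit_bits`-bit modular multiplication."""
    exponent = math.log2(3) if karatsuba else 2.0
    return num_mults * (bits / unit_bits) ** exponent


def round_sig(x: float, digits: int = 2) -> int:
    """Round a positive number to `digits` significant digits, half up."""
    scale = 10 ** (math.floor(math.log10(x)) - digits + 1)
    return int(x / scale + 0.5) * scale


if __name__ == "__main__":
    d = 192  # AES-192, cost equivalent sizes for 2010
    ecc_mults = 36 * d       # ECC encryption, field of 2d = 384 bits
    rsa_mults = 1.3 * 32     # RSA encryption with a 32-bit public exponent
    for karatsuba in (False, True):
        print("Karatsuba" if karatsuba else "regular",
              round_sig(relative_time(ecc_mults, 2 * d, karatsuba)),
              round_sig(relative_time(rsa_mults, 7400, karatsuba)))
```

Because the unit itself changes between the two models, the two printed rows should be compared within a row, not across rows.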

Table 10. Relative performance using regular arithmetic for the year 2010.

PKC          DES               2K3DES            3K3DES            AES-128           AES-192           AES-256

log2 n
RSA(-MP)     520 750           1500 2000         2200 2700         2900 3600         7400 8500         15000 16000

log2 p
Fp           520 750           1500 2000         2200 2700         2900 3600         7400 8500         15000 16000
LUC          260 370           770 980           1100 1400         1500 1800         3700 4200         7300 8100
XTR          90 130            260 330           370 450           490 590           1200 1400         2400 2700
ECC          112               190               224               256               384               512

encryption (with log2 e = 32 for RSA and RSA-MP)
RSA(-MP)     11 22             93 150            190 290           340 500           2200 2900         8500 10000
Fp           75 160            1100 1800         2700 4100         5500 8000         53000 69000       270000 340000
LUC          23 48             340 550           820 1300          1700 2500         16000 21000       84000 100000
XTR          8 17              120 200           290 450           610 890           5800 7600         30000 37000
ECC          24                120               190               290               970               2300

decryption
RSA (S)      43 130            1100 2300         3300 6200         7900 14000        130000 190000     970000 1300000
RSA (P)      2:22 2:65         2:560 2:1200      2:1600 2:3100     2:3900 2:7000     2:63000 2:95000   2:490000 2:660000
RSA-MP (S)   43 57             500 580           1400 1500         3500 3500         32000 30000       160000 210000
RSA-MP (P)   2:22 3:19         3:170 4:150       3:480 4:390       3:1200 4:870      4:7900 5:6100     5:31000 5:43000
Fp           37 77             550 900           1300 2000         2700 4000         26000 34000       140000 170000
LUC          11 24             170 280           410 630           840 1200          8100 11000        42000 51000
XTR          4 9               61 99             150 230           300 450           2900 3800         15000 18000
ECC          12                59                96                140               490               1100

signature generation
RSA (S)      43 130            1100 2300         3300 6200         7900 14000        130000 190000     970000 1300000
RSA (P)      2:22 2:65         2:560 2:1200      2:1600 2:3100     2:3900 2:7000     2:63000 2:95000   2:490000 2:660000
RSA-MP (S)   43 57             500 580           1400 1500         3500 3500         32000 30000       160000 210000
RSA-MP (P)   2:22 3:19         3:170 4:150       3:480 4:390       3:1200 4:870      4:7900 5:6100     5:31000 5:43000
Fp           23 48             340 550           820 1300          1700 2500         16000 21000       84000 100000
LUC          6 13              93 150            230 340           460 680           4400 5800         23000 28000
XTR          2 5               36 58             85 130            180 260           1700 2200         8700 11000
ECC          7                 32                53                79                270               630

signature verification (with log2 e = 32 for RSA and RSA-MP)
RSA(-MP)     11 22             93 150            190 290           340 500           2200 2900         8500 10000
Fp           44 92             660 1100          1600 2400         3300 4800         31000 41000       160000 200000
LUC          13 26             190 300           450 690           930 1400          8900 12000        46000 57000
XTR          5 10              71 120            170 260           350 520           3400 4400         17000 21000
ECC          13                65                110               160               530               1300

As an example of an application of Tables 10 and 11, suppose AES-192 is used
in 2010 along with a cost equivalent public key system. With regular (quadratic
growth) modular arithmetic, ECC encryption takes time equivalent to about 970
regular multiplications modulo a 1024-bit modulus. This can be expected to be
about twice as fast as RSA encryption (with a 32-bit public exponent), and
about six times faster than XTR encryption. But with Karatsuba-like arithmetic,
RSA encryption takes time equivalent to about 960 Karatsuba multiplications
modulo a 1024-bit modulus (but using a 7400-bit modulus). This can be expected
to be about 1.5 times faster than ECC encryption, and about six times faster
than XTR. For decryption, however, RSA is substantially slower than both ECC
and XTR for either type of arithmetic, even if RSA-MP is used on four parallel
processors.



6.4 Parameter Selection 

For all public key systems except ECC, parameter selection is dominated by the
generation of the primes defining the moduli, finite fields, and subgroup orders.
For each L-bit prime to be generated, the generation time is proportional to
M(L)L^2; a more precise runtime function depends on a wide variety of
implementation choices that are not discussed here. Obviously, parameter selection
for high security RSA, prime field, or LUC based systems will be slow compared
to RSA-MP and, in particular, XTR.

For systems based on a subgroup of F_{p^k} for fixed small p, public key data
are usually shared (except for the public point h). For such systems the speed
of parameter selection is therefore not an important issue.

ECC parameters can be found in expected polynomial time. Nevertheless, 
even for security equivalent to 2K3DES the solution is not yet considered to 
be sufficiently practical for systems with non-shared keys. The slow growth of 
the parameter sizes implies, however, that if a satisfactory solution is found for 
current (relatively low) security levels, then the solution will most likely also work 
fast enough for very high security levels. For ECC over fields of characteristic 
two this goal is close to being achieved [4]. 



7 Summary of Findings 

Matching AES-192 or AES-256 security levels with public key systems requires 
public key sizes far beyond anything in regular use today. For instance, to match 
the security of AES-192 with RSA, it would be prudent to use moduli of about 
7000 bits. But given current resources, the overall practicality of RSA with such 
moduli is questionable. Encryption and signature verification are faster than for 
any other system if the public exponent is small, but the modulus itself may be 
prohibitively large. RSA-MP fares a little better. But even if fully parallelized it is 
still relatively unattractive. An interesting observation is that computationally 
equivalent RSA-MP moduli often allow more factors than the (smaller) cost 
equivalent ones, and may thus attain greater decryption and signature generation 
speed (at the cost of a higher level of parallelism). 

Table 11. Relative performance using Karatsuba arithmetic for the year 2010.

PKC          DES               2K3DES            3K3DES            AES-128           AES-192           AES-256

log2 n
RSA(-MP)     520 750           1500 2000         2200 2700         2900 3600         7400 8500         15000 16000

log2 p
Fp           520 750           1500 2000         2200 2700         2900 3600         7400 8500         15000 16000
LUC          260 370           770 980           1100 1400         1500 1800         3700 4200         7300 8100
XTR          90 130            260 330           370 450           490 590           1200 1400         2400 2700
ECC          112               190               224               256               384               512

encryption (with log2 e = 32 for RSA and RSA-MP)
RSA(-MP)     14 25             79 120            140 190           220 300           960 1200          2800 3300
Fp           99 180            940 1400          1900 2700         3500 4800         23000 29000       90000 110000
LUC          40 72             380 560           800 1100          1500 2000         9400 12000        37000 44000
XTR          23 41             220 320           450 630           820 1100          5400 6600         21000 25000
ECC          60                240               360               510               1500              3100

decryption
RSA (S)      76 200            1300 2400         3200 5500         6800 11000        74000 110000      430000 560000
RSA (P)      2:38 2:99         2:630 2:1200      2:1600 2:2700     2:3400 2:5600     2:37000 2:53000   2:220000 2:280000
RSA-MP (S)   76 100            660 790           1700 1800         3600 3700         25000 25000       100000 130000
RSA-MP (P)   2:38 3:34         3:220 4:200       3:560 4:460       3:1200 4:930      4:6200 5:4900     5:20000 5:26000
Fp           49 88             470 690           970 1400          1800 2400         12000 14000       45000 53000
LUC          20 36             190 280           400 560           730 980           4700 5800         18000 22000
XTR          12 21             110 160           230 320           410 560           2700 3300         10000 12000
ECC          30                120               180               260               730               1500

signature generation
RSA (S)      76 200            1300 2400         3200 5500         6800 11000        74000 110000      430000 560000
RSA (P)      2:38 2:99         2:630 2:1200      2:1600 2:2700     2:3400 2:5600     2:37000 2:53000   2:220000

The unattractive sizes of RSA moduli at high security levels are entirely due
to the number field sieve. If it had not been invented, and the asymptotically
slower quadratic sieve factoring algorithm were still the fastest factoring
algorithm, then at least until 2030 RSA moduli of 2048, 4096, and 8192 bits
would be good matches for AES-128, AES-192, and AES-256, respectively. But
it could have been worse too: if the special number field sieve applied to
RSA moduli, then RSA moduli would have to be chosen according to Table 6
instead of Table 1, i.e., considerably larger.

Compared to RSA and RSA-MP, subgroups of prime fields have the same 
size problem. They are much slower for encryption and signature verification. 
Decryption and signature generation is competitive only in environments where 
RSA and RSA-MP cannot be parallelized. Furthermore, subgroups of prime 
fields are consistently outperformed by LUC and XTR. So, unless second and 
sixth degree extension fields turn out to be less secure than currently believed, 
subgroups of prime fields are not competitive. 

Similarly, LUC is consistently outperformed by XTR^. Unless a dramatic 
breakthrough occurs in the fixed degree extension field discrete logarithm prob- 
lem, XTR is a good choice if one insists on using a non-ECC subgroup public key 
system. It has the additional advantages that parameter selection is easy and 
that current special purpose RSA modular multipliers (that can handle public 
moduli up to, say, 1024 bits) may be used even for very high security applications 
(possibly using Remark 5.4). The latter is also possible for LUC (if Remark 5.4 
is used), may be possible for RSA-MP, but is out of the question for RSA or 
prime field subgroups. 

Overall, ECC suffers the smallest performance degradation when moving to 
very high security levels. Generation of ECC public keys in a non-shared set-up 
remains problematic, for all security levels. If that is not a concern, and barring 
cryptanalytic progress affecting the elliptic curve discrete logarithm problem, 
the choice is obvious. 

For current security levels, i.e., comparable to 1024-bit RSA, the choice is 
between RSA, RSA-MP, XTR, and ECC and will mostly depend on the ap- 
plication. For current higher security levels, comparable to 2048-bit RSA, the 
theoretical performance gap between ECC and the other public key systems al- 
ready becomes noticeable, with only XTR still within range of ECC. However, 
hardware accelerators are currently available for 2048-bit RSA and RSA-MP, 
but not for other security equivalent public key systems. So, for the next few 
years RSA and RSA-MP will still be the methods of choice in many practical 
circumstances where security equivalent to 2048-bit RSA is required. This may 
change radically if new types of hardware accelerators are developed. And even 
if that does not happen, it will change eventually, i.e., for higher security levels, 
because special purpose hardware cannot beat the asymptotics. 

Disclaimer. The contents of this paper are the sole responsibility of the author 
and not of his employer. The author does not accept any responsibility for the 
use of the material presented in this paper. Despite his academic involvement 

^ However, for LUC it is in general faster to test if a value is correctly formatted, i.e., 
if it is the trace of a proper subgroup element. Refer to [9] for details. 

Abstract. Although the Miller-Rabin test is very fast in practice, there
exist composite integers n for which this test fails for 1/4 of all bases
coprime to n. In 1998 Grantham developed a probable prime test with
failure probability of only 1/7710 and asymptotic running time 3 times
that of the Miller-Rabin test. For the case that n ≡ 1 mod 4, S. Müller
established a test with failure rate of 1/8190 and running time comparable
to that of the Grantham test. Very recently, with running time
always at most 3 Miller-Rabin tests, this was improved to 1/131040, for
the other case, n ≡ 3 mod 4. Unfortunately the underlying techniques
cannot be generalized to n ≡ 1 mod 4. Also, the main ideas for proving
this result do not extend to n ≡ 1 mod 4.

Here, we explicitly deal with n ≡ 1 mod 4 and propose a new probable
prime test that is extremely efficient. For the first round, our test has
average running time (4 + o(1)) log2 n multiplications or squarings mod
n, which is about 4 times as many as for the Miller-Rabin test. But
the failure rate is much smaller than 1/4^4 = 1/256. Indeed, for our
test we prove a worst case failure probability less than 1/1048350.
Moreover, each iteration of the test runs in time equivalent to only 3
Miller-Rabin tests. But for each iteration, the error is less than 1/131040.

Keywords: Probable Prime Testing, Error Probability, Worst Case
Analysis, Quadratic-Field Based Methods, Combined Tests



1 Introduction 

1.1 Motivation 

Large prime numbers are essential for most cryptographic applications. Perhaps
the most common probabilistic prime test is the Strong Fermat Test (Miller-
Rabin Test), which consists of testing that a^s ≡ 1, resp. a^{2^j s} ≡ −1 mod n
for some 0 ≤ j ≤ r − 1, where n − 1 = 2^r s with s odd. Although exponentiation
modulo n can be performed extremely fast, the catch with this, as with any
probable prime test, is the existence of pseudoprimes. This means that certain
composite integers are identified as primes by the test.
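
For reference, one round of the Strong Fermat Test can be sketched as follows. This is a standard textbook formulation, not code from the paper; the function name is hypothetical. The final lines illustrate the pseudoprime problem with 2047 = 23 · 89, which passes to base 2 but is exposed by base 3.

```python
def is_strong_probable_prime(n: int, a: int) -> bool:
    """Strong Fermat (Miller-Rabin) test of an odd n > 2 to the base a.

    Writes n - 1 = 2^r * s with s odd and accepts if a^s = 1 (mod n)
    or a^(2^j * s) = -1 (mod n) for some 0 <= j <= r - 1.
    """
    if n < 3 or n % 2 == 0:
        raise ValueError("n must be an odd integer greater than 2")
    r, s = 0, n - 1
    while s % 2 == 0:
        r += 1
        s //= 2
    x = pow(a, s, n)
    if x == 1 or x == n - 1:
        return True
    for _ in range(r - 1):
        x = pow(x, 2, n)
        if x == n - 1:
            return True
    return False


if __name__ == "__main__":
    print(is_strong_probable_prime(2047, 2))  # composite, but passes base 2
    print(is_strong_probable_prime(2047, 3))  # base 3 is a witness
```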

* Research supported by the Austrian Science Fund (FWF), FWF-Project no. P 
14472-MAT 

C. Boyd (Ed.): ASIACRYPT 2001, LNCS 2248, pp. 87-106, 2001. 

In a typical cryptographic scenario, some of the involved parties may be
malicious. If an adversary manages to sell composites as primes, this usually
compromises the security of the corresponding protocol. As strong pseudoprimes
can easily be constructed, this often allows fooling a pseudoprimality testing
device that utilizes Miller-Rabin only. As an example, strong pseudoprimes are
known with respect to all forty-six prime bases a up to 200 [3]. While a composite
number can be a strong pseudoprime for at most 1/4 of all bases coprime to n,
there exist composites that actually attain this largest possible bound of
1/4 of the bases. Moreover, such numbers can efficiently be characterized and
constructed [9]. Although it is known that a = 2 is a witness (for the compositeness
of n) for most odd composites, it was shown in [2] that there are infinitely many
Carmichael numbers whose least witness is larger than (log n)^{1/(3 log log log n)}.

Also, it is conjectured [2] that there are Carmichael numbers n < x for 
which there is no base a in any given set of A log x distinct integers < x that 
proves n composite by the Miller-Rabin test. 

While Miller-Rabin works well for any average number n on input a random 
base a, due to the fact that pseudoprimes can be constructed, the cautious might 
want to minimize the chance of being sold a composite instead of a prime. 

There exist a number of deterministic algorithms for primality testing (see 
e.g., [7,10,12,20,32]), which however require rather involved theory and imple- 
mentation. The advantage with pseudoprimality testing still is, that these ap- 
proaches are a lot faster and can much more easily be realized in practice. 

The result of this paper is a new probable prime test which is considerably 
more reliable than the previous proposals, but which still is much easier to 
describe and implement than the deterministic tests. 



1.2 The Proposed Test 

The main ideas for the pseudoprimality tests [16,26,28] consist of a combined
Miller-Rabin test utilizing both the original Fp-based algorithm and
the quadratic field (QF)-based analogue. An additional testing criterion in
[26,28] is based on the underlying (Cipolla related) square root finding algorithm
modulo primes p (Lemma 1 below). If the result is not a correct root modulo n,
n is disclosed as composite. Otherwise, this gives an additional testing condition.

Here, we incorporate yet another root-finding algorithm when n ≡ 1 mod 4.
This constitutes a counterpart to the very recent results in [28] for n ≡ 3 mod 4
(the easier case). Via some efficient algorithm we test for what should be a square
root of some Q mod n with Jacobi symbol (Q/n) = 1. This automatically constitutes
a strengthened version of a Miller-Rabin Test. Subsequently, we test for the
square root of 1 in the quadratic extension. We show how the root finding part
can be obtained with low cost, while simultaneously obtaining a speed-up for the
evaluation of the QF-part, as well as a reduction of the failure rate.

In essence, the test for any n ≡ 1 mod 4 runs as follows. We incorporate the
same trivial testing conditions in our precomputation as does Grantham [16].
Also, as in [16], we assume that n is not a perfect square.

0. (Precomputation)

   - If n is divisible by a prime up to min{B, √n}, where
     B = 50000, declare n to be composite and stop.

   - If √n ∈ Z declare n to be composite and stop.

1. (Parameter Selection)

   Select randomly P ∈ Z_n, Q ∈ Z_n^* such that (Q/n) = 1, ((P^2 − 4Q)/n) = −1.

2. (Square Root Part)

   - Run one of the square root finding algorithms of sect. 2.3
     for the root of Q modulo n.
     If the root finding algorithm declares n composite, stop.

   - Let a be the root of Q obtained, and let P' ← P/a mod n.

3. (QF-Based Part)

   - Let α(P',1), ᾱ(P',1) be the roots of x^2 − P'x + 1.

   - Test if α(P',1)^{(n+1)/2} ≡ ᾱ(P',1)^{(n+1)/2} mod n.
     // For efficient practical realization see sect. 3.1.
     If not, n is composite and stop.

   - Compute gcd(α(P',1)^{(n+1)/2} ± 1, n). If one of these reveals a
     proper factor of n, output the factor. Otherwise declare n
     to be a probable prime.

The above describes the first round of the test. When being iterated, some 
of the calculations can be done more efficiently (see sect. 3.3). 
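
Steps 0 and 1 of the outline can be sketched as below. This is a hedged illustration, not the author's implementation: it assumes that the two conditions in step 1 are the Jacobi symbol conditions (Q/n) = 1 and ((P^2 − 4Q)/n) = −1, and that the trial division bound is B = 50000. All function names are hypothetical, and the square root and QF-based parts (steps 2 and 3) are not covered.

```python
import math
import random

B = 50000  # assumed trial division bound of the precomputation step


def jacobi(a: int, n: int) -> int:
    """Jacobi symbol (a/n) for odd n > 0, via quadratic reciprocity."""
    a %= n
    result = 1
    while a:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0


def precompute(n: int) -> bool:
    """Step 0: return False if n is a perfect square or has a small factor."""
    root = math.isqrt(n)
    if root * root == n:
        return False
    # Dividing by every integer (not only primes) gives the same verdict.
    for p in range(2, min(B, root) + 1):
        if n % p == 0:
            return False
    return True


def select_parameters(n: int, rng: random.Random) -> tuple:
    """Step 1: random P, Q with (Q/n) = 1 and ((P^2 - 4Q)/n) = -1.

    Requires n odd and not a perfect square, otherwise the loop may not end.
    """
    while True:
        P = rng.randrange(n)
        Q = rng.randrange(1, n)
        if jacobi(Q, n) == 1 and jacobi(P * P - 4 * Q, n) == -1:
            return P, Q


if __name__ == "__main__":
    n = 1000033  # an arbitrary odd number with n % 4 == 1
    if precompute(n):
        print(select_parameters(n, random.Random(0)))
```

Requiring (Q/n) = 1 also guarantees gcd(Q, n) = 1, so Q is a unit modulo n as step 1 demands.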

1.3 The Results of This Paper 

The main result of this paper is the following theorem. As in [16], one selfridge
is equivalent to the time required for one round of Miller-Rabin.

Theorem 1. A composite integer n ≡ 1 mod 4 passes k iterations of the
proposed test with worst case failure probability less than
1/1048350 · 1/131040^{k−1}, which is approximately 1/2^{17k+3}.

For k iterations, the above test has average running time 3k + 1 selfridges.
In detail, the result can be stated as follows.

• For one round of the proposed test, the failure probability is less than
1/2^{20} + 1/(2B^2) < 1/1048350 and the average running time is 4 selfridges.

• For each additional iteration, the proposed test has worst case failure
probability 1/2^{17} + 4/B^2 < 1/131040 and average running time 3 selfridges.

The first round failure rate should be contrasted to the worst case error
probability 1/256 of four iterations of the Miller-Rabin test. For two iterations
of the proposed test this is 1/(1.37 · 10^11), as opposed to 1/16384; for three
iterations it is 1/(1.8 · 10^16), as opposed to 1/1048576, etc.

The estimate is based on worst case analysis and on the assumption of the 
existence of special (bad) composites. Otherwise, the result would even be better. 
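
The stated bounds can be checked with exact rational arithmetic. The sketch below assumes B = 50000, the first round bound 1/2^20 + 1/(2B^2), the per-iteration bound 1/2^17 + 4/B^2, and a comparison with Miller-Rabin at equal running time (3k + 1 rounds with error at most 1/4 each). The function names are hypothetical.

```python
from fractions import Fraction

B = 50000  # assumed trial division bound from the precomputation step

FIRST_ROUND = Fraction(1, 2**20) + Fraction(1, 2 * B**2)
EXTRA_ROUND = Fraction(1, 2**17) + Fraction(4, B**2)


def proposed_bound(k: int) -> Fraction:
    """Worst case failure bound of Theorem 1 for k iterations."""
    return Fraction(1, 1048350) * Fraction(1, 131040) ** (k - 1)


def miller_rabin_bound(k: int) -> Fraction:
    """Worst case bound of 3k + 1 Miller-Rabin rounds (same running time)."""
    return Fraction(1, 4) ** (3 * k + 1)


if __name__ == "__main__":
    print(FIRST_ROUND < Fraction(1, 1048350), EXTRA_ROUND < Fraction(1, 131040))
    for k in (1, 2, 3):
        print(k, float(1 / proposed_bound(k)), int(1 / miller_rabin_bound(k)))
```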

The number of pairs that pass the proposed test (so-called 'liars') can
explicitly be determined. This number of liars is largest for integers n such
that p − 1 | n − 1 and p + 1 | n + 1 for all primes p | n. Such special types of
numbers must be very rare and it is not even known whether they exist at all.
This shows the difficulty for composites to pass the test with respect to varied
parameters. Thus, the average case error rate is expected to be much smaller
(see [14,37]).

Below, we describe one method by which the underlying algorithms can be
evaluated easily and efficiently. It is based on naive multiply/add arithmetic and
can be implemented with little effort. Alternatively, this could be achieved via
the computation of elements in a quadratic extension field [22], the evaluation of
second-order recurrences and Lucas chains [9,16,23,34,40], or of powers of 2 × 2
matrices [35].

For modular exponentiation, many improvements to the conventional power- 
ing ladder have been designed. We hope that analogously to the many tools for 
speeding up exponentiation in the prime field, similar devices for the QF-part
will further improve on the practicality of the proposed test. 



1.4 Related Work 

A number of probable prime tests have been proposed which are based on various 
testing functions [1,6,8,19,33]. It turns out that the methods based on different 
underlying techniques are the most reliable ones, whereas those based on one 
technique only, allow the generation of pseudoprimes, even with respect to varied 
testing parameters. From a practical viewpoint however, the suggestions based 
on third and higher-order recurrences seem to be too expensive. 

Pomerance, Selfridge, Wagstaff [33] and Baillie, Wagstaff [8] proposed a test 
based on both the Fermat test and on second-order (Lucas) sequences, which is 
very powerful. Although the underlying criteria can be evaluated extremely fast, 
no composite number is known for which this probable prime test fails. Indeed, 
nobody has yet claimed the $620 that is offered for such an example. While it is 
not known whether this test does allow any pseudoprimes at all, some heuristics 
indicate that such composites actually might exist [31]. Although the specific 
choice of the parameters makes the routine easy to describe, it might increase 
the chance of generating any pseudoprimes with respect to these parameters. 
Some related tests based on different parameters have been implemented in 
several computer-algebra systems which however turned out to be quite weak 
[30]. It is not known how reliable other parameters to this test are. Also, there 
is no quantifying measure to determine how reliable it actually is. 

Several probabilistic tests have been published, for which an explicit esti- 
mate on the worst case failure probability is known. 

— The Miller-Rabin test is usually taken as a unit measure with running time 
1 selfridge [16] and worst case failure 1/4. 

— J. Grantham [16] proposed an extremely efficient test with worst case failure 
rate 1/7710 and asymptotic running time 3 selfridges. Unfortunately, the 
practical implementation is rather involved and it seems that on average 
$4.5 \log_2(n)$ multiplications (instead of the asymptotic $(3 + o(1)) \log_2(n)$) are
necessary. 

— S. Müller [26] developed a probable prime test for the case n = 1 mod 4.
The test has running time similar to the Grantham test, but
with worst case error probability 1/8290 per round. This bound, however,
can only be achieved for at least two iterations of the test.

— Recently, a proposal has been made [28] for n = 3 mod 4 with failure rate 
1/131040 but only 3 selfridges running time. 

Jaeschke’s tables [17] of strong pseudoprimes show that these occur very 
frequently for n = 1 mod 4. Unfortunately, the techniques for the most effective 
test above, [28], are exclusive for the case n = 3 mod 4. As the condition 4|n+ 1 
constitutes a critical requirement for both the methods employed, as well as 
for the failure estimate, this cannot be extended to n = 1 mod 4. Our results 
will be improvements and extensions of the methods of [26]. Indeed, for integers 
n = 1 mod 4 essentially new techniques will be developed in this paper. 

Relevance to Cryptography: For cryptographic applications, it is often necessary
to generate probable primes, i.e., numbers which are prime except for an arbitrarily
small error rate. E.g., if an error probability of $1/2^{100}$ is to be achieved, one needs

— 50 iterations of Miller-Rabin, which is 50 selfridges, 

— 8 iterations of the Grantham test, which is (asymptotically) 24 selfridges, 

— 6 iterations of the proposed test, which is only 19 selfridges. 

Due to the simple evaluation method of the proposed test via a naive powering 
ladder (sect. 3.1), we hope that this theoretical improvement will have some 
practical significance as well. 
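The three iteration counts can be checked mechanically. The sketch below assumes only the per-round bounds quoted in this section (1/4 for Miller-Rabin, 1/7710 for Grantham, and 1/1048350 followed by 1/131040 for the proposed test); the helper name `rounds_needed` is hypothetical.

```python
from fractions import Fraction


def rounds_needed(first: Fraction, later: Fraction, target: Fraction) -> int:
    """Smallest k with first * later**(k - 1) <= target."""
    k, bound = 1, first
    while bound > target:
        bound *= later
        k += 1
    return k


TARGET = Fraction(1, 2 ** 100)

mr_rounds = rounds_needed(Fraction(1, 4), Fraction(1, 4), TARGET)
grantham_rounds = rounds_needed(Fraction(1, 7710), Fraction(1, 7710), TARGET)
proposed_rounds = rounds_needed(Fraction(1, 1048350), Fraction(1, 131040), TARGET)

# Cost in selfridges: 1 per Miller-Rabin round, 3 per Grantham round
# (asymptotically), and 3k + 1 for k rounds of the proposed test.
costs = {
    "Miller-Rabin": 1 * mr_rounds,
    "Grantham": 3 * grantham_rounds,
    "proposed": 3 * proposed_rounds + 1,
}

if __name__ == "__main__":
    print(mr_rounds, grantham_rounds, proposed_rounds, costs)
```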

2 The New Idea 

2.1 Some Fundamental Properties 

Unless stated otherwise, let $p$, $p_i$ be an odd prime, respectively an odd prime
divisor of an integer n = 1 mod 4 that is to be tested for primality. For simplicity
we use the abbreviations of [36], psp(a), epsp(a), spsp(a), to denote, respectively,
a pseudoprime, an Euler pseudoprime, and a strong pseudoprime, to base a.

Let $e(p) = \left(\frac{D}{p}\right)$ and $e(n) = \left(\frac{D}{n}\right)$, for $D = P^2 - 4Q$ the discriminant of
$x^2 - Px + Q$ with characteristic roots $\alpha = \alpha(P,Q)$, $\bar\alpha = \bar\alpha(P,Q)$. We will
assume that gcd(2QD, n) = 1.

A number of probable prime tests are based on suitable properties in $F_{p^2}$. As
with the Miller-Rabin test in $F_n$, when n = p is prime, for both roots $y \in F_{n^2}$
of $x^2 - Px + Q$ with e(n) = -1, one has $y^u \equiv 1 \bmod n$, or $y^{2^k u} \equiv -1 \bmod n$ for
some $0 \le k \le t-1$, where $n^2 - 1 = 2^t u$ with u odd. The exponent $2^t u = n^2 - 1$ is
still too large for obtaining strong testing conditions. More restrictive ones are
obtained via $y^{n - e(n)} \equiv 1$, respectively $Q \bmod n$, according as e(n) = 1 or
-1. As the former case constitutes an ordinary Fermat condition, in combination
with a Fermat test, it only makes sense to test for the latter one. Thus, unless 
stated otherwise, we will throughout assume e(n) = —1. 

Composite integers n fulfilling $\alpha(P,Q)^{n+1} \equiv Q \bmod n$ for e(n) = -1 are known
as quadratic field based pseudoprimes w.r.t. (P,Q), abbrev. QFpsp(P,Q).

If $\left(\frac{Q}{n}\right) = 1$ and if $\alpha$, $\bar\alpha$ denote the two roots y, then, for n prime, the two roots
need to evaluate to the same value, even with the smaller exponent (n - e(n))/2 in
place of n - e(n), i.e., we must have $\alpha^{(n-e(n))/2} \equiv \bar\alpha^{(n-e(n))/2} \bmod n$. Composite
integers fulfilling this criterion are denoted elpsp(P,Q). In our case, for n =
1 mod 4 and e(n) = -1, the value (n + 1)/2 is odd which already constitutes the
strong Lucas test and the pseudoprimes are denoted slpsp(P,Q).

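Lemma 1. Let e(n) = -1 and let n = 1 mod 4 be a composite integer that
fulfills $\alpha(P,Q)^{n+1} \equiv \bar\alpha(P,Q)^{n+1} \equiv Q \bmod n$ for $\left(\frac{Q}{n}\right) = 1$. Then n is both psp(Q)
and QFpsp(P,Q). If $\alpha^{(n+1)/2} \equiv \bar\alpha^{(n+1)/2} \bmod n$ then n is slpsp(P,Q) for
$\left(\frac{D}{n}\right) = -1$ and, moreover, $\left(\frac{Q}{p}\right) = 1$ for all prime divisors p of n.

Proof. This follows directly from the proof of Theorem 3, [26], because for n =
1 mod 4, (n - e(n))/2 = (n + 1)/2 is odd. □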

The above conditions are tested in [16]; however, Grantham does not consider
the nature of the value $\alpha^{(n-e(n))/2}$ modulo n. In [26], a formula was obtained
when n is a prime, and this was used to establish a new pseudoprimality test.

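Proposition 1. If $\alpha$ is any root of $x^2 - Px + Q$, and if $a^2 \equiv Q \bmod n$ for n
prime, then $\alpha^{(n-e(n))/2} \equiv \bar\alpha^{(n-e(n))/2} \bmod n$, and this is equivalent to $\left(\frac{P+2a}{n}\right) \bmod n$,
if e(n) = 1, and equivalent to $\left(\frac{P+2a}{n}\right) a \bmod n$, if e(n) = -1.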

Often a composite n fulfills the condition $\alpha^{n+1} \equiv Q \bmod n$, but
not the stronger one of Proposition 1. In that case $\gcd(\alpha^{(n-e(n))/2} \pm a, n)$ is a
proper factor of n. This is the final condition being tested in Step 3 of the test.

2.2 The Main Problem 

While the values $\alpha(P,Q)^k$, $\bar\alpha(P,Q)^k$, and $Q^k \bmod n$ theoretically can be evaluated
with less than $(3 + o(1)) \log_2 n$ multiplications [16], the practical application
of the techniques in [16] is rather involved. For general Q, the fastest algorithm is
given in [16]. Unfortunately, this requires special representation of k in terms of
shortest addition chains. Brauer's Theorem [18] guarantees that asymptotically
the number of multiplications in such shortest addition chains is $o(\log(n))$, that
is, it is vanishingly small compared to the number of squarings needed. This gives
the asymptotically small running time of the Grantham test, but in practice, the
required number of multiplications seems to be more like $4.5 \log_2(n)$.

For $\alpha = \alpha(P,Q)$, $\bar\alpha = \bar\alpha(P,Q)$, define the Lucas functions by
$U_m(P,Q) = (\alpha^m - \bar\alpha^m)/(\alpha - \bar\alpha)$ and $V_m(P,Q) = \alpha^m + \bar\alpha^m$. It can be shown
that these are always integers (see, e.g., [41]).
Thus, for the QF-based tests (with, as usual, e(n) = -1), the condition
$\alpha^{(n+1)/2} \equiv \bar\alpha^{(n+1)/2} \bmod n$ is equivalent to the vanishing of $U_k(P,Q) \bmod n$ for
k = (n + 1)/2. This, in turn can easily be checked via the condition

$D\,U_k(P,Q) = 2V_{k+1}(P,Q) - P\,V_k(P,Q)$    (1)

by means of two V-values, which is much easier than evaluating the U-function.
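The relation between the U- and V-functions can be checked on small integers. The sketch below assumes the standard recurrences $U_{j+1} = P U_j - Q U_{j-1}$ and $V_{j+1} = P V_j - Q V_{j-1}$ with $U_0 = 0$, $U_1 = 1$, $V_0 = 2$, $V_1 = P$; the function names are hypothetical and the naive loop is meant for illustration only, not for use on large n.

```python
def lucas_uv(m: int, P: int, Q: int) -> tuple[int, int]:
    """Return the exact integers (U_m(P,Q), V_m(P,Q)) by the linear recurrence."""
    u0, u1 = 0, 1      # U_0, U_1
    v0, v1 = 2, P      # V_0, V_1
    for _ in range(m):
        u0, u1 = u1, P * u1 - Q * u0
        v0, v1 = v1, P * v1 - Q * v0
    return u0, v0


def identity_holds(k: int, P: int, Q: int) -> bool:
    """Check D * U_k = 2 * V_{k+1} - P * V_k with D = P^2 - 4Q."""
    D = P * P - 4 * Q
    u_k, v_k = lucas_uv(k, P, Q)
    _, v_k1 = lucas_uv(k + 1, P, Q)
    return D * u_k == 2 * v_k1 - P * v_k


if __name__ == "__main__":
    # Toy prime n = 13 with P = 3, Q = 1: D = 5 is a non-residue mod 13,
    # so U_k should vanish mod n for k = (n + 1) / 2 = 7.
    u7, v7 = lucas_uv(7, 3, 1)
    print(u7 % 13, v7 % 13, identity_holds(7, 3, 1))
```

The point of the identity is that a test implementation never needs the U-values themselves: two consecutive V-values suffice to decide whether $U_k$ vanishes modulo n.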

Moreover, the computation of $V_k(P,Q)$ for Q = 1 is much easier and faster
than for general Q. Thus, it is natural to ask, how easily the required $V_k(P,Q)$,
$V_{k+1}(P,Q)$ can be computed via some shifted parameters (P',Q') with Q' = 1.

A transformation between $V_k(P,Q)$ and $V_{2k}(P,1)$ is given in [13]. Unfortu-
nately this induces a shift of the degree from k to 2k and cannot be applied in
our scenario, which requires k = (n - e(n))/2 = (n + 1)/2 to remain odd.

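As in our case Q is a square, we apply the following well-known identities,

$V_k(ca, a^2) = a^k V_k(c,1)$,   $a\,U_k(ca, a^2) = a^k U_k(c,1)$.    (2)

Hence, if $\alpha(P/a,1)^{(n+1)/2} \equiv \bar\alpha(P/a,1)^{(n+1)/2} \equiv \pm 1 \bmod n$ and $a^2 \equiv Q \bmod n$,
then also $\alpha(P,Q)^{(n+1)/2} \equiv \bar\alpha(P,Q)^{(n+1)/2} \bmod n$.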
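For the reader's convenience, the scaling identities for a square second parameter follow in one line from the characteristic roots. The derivation below is a sketch in which $\gamma$, $\bar\gamma$ (our notation, not the paper's) denote the roots of $x^2 - cx + 1$:

```latex
% gamma, \bar\gamma: roots of x^2 - c x + 1, so gamma + \bar\gamma = c and gamma \bar\gamma = 1.
\begin{align*}
(x - a\gamma)(x - a\bar\gamma) &= x^2 - ca\,x + a^2, \\
V_k(ca, a^2) &= (a\gamma)^k + (a\bar\gamma)^k = a^k V_k(c, 1), \\
a\,U_k(ca, a^2) &= a \cdot \frac{(a\gamma)^k - (a\bar\gamma)^k}{a\gamma - a\bar\gamma} = a^k U_k(c, 1).
\end{align*}
```

In other words, the characteristic roots for the parameters $(ca, a^2)$ are exactly $a$ times those for $(c, 1)$, which is why a square root a of Q reduces the general case to Q = 1.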

Our main goal is a method for the separate computation of a root a of Q
modulo n and for the evaluation of $\alpha(P/a,1)^k$, which in total is faster than the
evaluation of $\alpha(P,Q)^k$, and which also induces a smaller failure rate. In detail,
for the former,

— Find a practical root-finding algorithm that returns the root a of Q, $\left(\frac{Q}{n}\right) = 1$,
for n prime, but with high probability discloses n as composite, otherwise.

— If the value a returned is a correct root of Q modulo n, then this should 
impose restrictive pseudoprimality conditions on n. 



Remark 1. 1. If a is indeed a correct root of Q mod n, then the QF-part of the
proposed test implies $\alpha(P,Q)^{(n+1)/2} \equiv \bar\alpha(P,Q)^{(n+1)/2} \bmod n$.
If the root-finding algorithm imposes the condition $a^{(n-1)/2} \equiv \pm 1 \bmod n$ on
n, then the above quantity is congruent to ±a mod n (see Proposition 1)
and in that case n is also spsp(Q).

2. This shows why the case n = 3 mod 4 in [28] is easier to deal with. Not
only can the root be efficiently computed via $Q^{(n+1)/4} \bmod n$, but also, even
when n is composite, this implies that $a^{(n-1)/2} \equiv \pm 1 \bmod n$.

3. While the root-finding algorithms for n = 1 mod 4 are more expensive, they 
will be used in a way so as to induce some additional testing conditions. 



2.3 Square Roots Modulo n and Conditions on the Pseudoprimes 



The case that n = 1 mod 4.

Let $n = 2^r s + 1$, with s odd, and call r the order of n. Suppose $\left(\frac{u}{n}\right) = -1$
and $u^{(n-1)/2} \equiv -1 \bmod n$. Then the 2-Sylow subgroup $S_r$ of $Z_n^*$ is cyclic of order
$2^r$. Shanks' root-finding algorithm [38] is based on the relation $a^2 \equiv bQ \bmod n$
for some b in some $S_k$. When n is prime, there exist new k, b, a such that this
condition still holds and the index k decreases. Subsequently b gets pushed down
into smaller subgroups of $S_k$ until finally $b \in S_0 = \{1\}$, and the solution is found.

Note that the algorithm hinges on the existence of some u as above. But that 
criterion is not limited to n being prime. Modulo n, that condition on the u will 
either fail, or often the result of the algorithm will not be a root of Q. Indeed, 
the algorithm of Shanks not only efficiently performs Step 2 of the proposed test, 
but also works as an efficient probable prime test (see also [29]). 

// Detailed Description of Step 2 of the Proposed Test.

INPUT: n = 2^r s + 1, s odd, (Q/n) = 1.

OUTPUT: a, a square root of Q mod n, or 'n is composite'.

1. (Precomputation)

Choose randomly u in Z_n^* with (u/n) = -1. Let z <- u^s mod n. If not
z^(2^(r-1)) = -1 mod n, declare n to be composite.

2. (Initialization)

Let k <- r, t <- Q^((s-1)/2) mod n, a <- Qt mod n, b <- at mod n.

3. (Body of the Algorithm)

While b != 1 mod n (*)

    m <- 1, B <- b, found <- false;

    While m < k and found = false (**)

        if B = 1 then OUTPUT g = gcd(B_0 - B, n);
        // proper factor of n found
        if B = -1 then found <- true;
        else m <- m+1, B_0 <- B, B <- B^2 mod n.

    If found = false then OUTPUT 'n is composite'.

    // otherwise we have B = b^(2^(m-1)) = -1 mod n
    Update t <- z^(2^(k-m-1)), z <- t^2, b <- bz, a <- at mod n, k <- m.

4. OUTPUT ±a mod n.

The algorithm always returns a root of Q when n is prime. This also holds
for n = 3 mod 4. Note the more restrictive condition (**), $b^{2^{m-1}} \equiv -1 \bmod n$
for $m \ge 1$, as opposed to the original one by Shanks, $b^{2^m} \equiv 1 \bmod n$. This
introduces an additional pseudoprimality testing condition.
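A possible rendering of this root-finding step in Python is sketched below. It is not the author's implementation: the function names and the `(status, value)` return convention are hypothetical, the loop index k is initialised to r so that the k-values form a strictly decreasing sequence starting at r, and the inputs are assumed to satisfy gcd(Q, n) = 1 and (Q/n) = 1.

```python
import math
import random


def jacobi(a: int, n: int) -> int:
    """Jacobi symbol (a/n) for odd n > 0."""
    a %= n
    result = 1
    while a != 0:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0


def shanks_root(Q: int, n: int, rng=random):
    """Return ('root', a), ('factor', g) or ('composite', None)."""
    if math.isqrt(n) ** 2 == n:          # no u with (u/n) = -1 exists
        return ("composite", None)
    r, s = 0, n - 1
    while s % 2 == 0:
        r, s = r + 1, s // 2
    # 1. Precomputation: random u with Jacobi symbol -1, strong test on u.
    while True:
        u = rng.randrange(2, n)
        if jacobi(u, n) == -1:
            break
    z = pow(u, s, n)
    if pow(z, 2 ** (r - 1), n) != n - 1:
        return ("composite", None)
    # 2. Initialization: a^2 = Q * b holds from here on.
    k = r
    t = pow(Q, (s - 1) // 2, n)
    a = Q * t % n
    b = a * t % n
    # 3. Body: push b into smaller 2-subgroups, insisting on -1 before 1.
    while b != 1:
        m, B, B0, found = 1, b, None, False
        while m < k and not found:
            if B == 1:                   # reached 1 without passing -1
                return ("factor", math.gcd(B0 - 1, n))
            if B == n - 1:
                found = True
            else:
                m, B0, B = m + 1, B, B * B % n
        if not found:
            return ("composite", None)
        t = pow(z, 2 ** (k - m - 1), n)
        z = t * t % n
        b = b * z % n
        a = a * t % n
        k = m
    # 4. Output (the other root is n - a).
    return ("root", a)
```

Because the relation $a^2 \equiv Qb \bmod n$ is maintained algebraically, any value returned with status `'root'` is a genuine square root of Q modulo n, whether or not n is prime.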

Lemma 2. If a composite n passes the precomputation, then n is spsp(u). If
the original b is congruent to 1 modulo n, or if n fulfills condition (**) at least
for the first loop (*), then n is spsp(Q) and $a^{(n-1)/2} \equiv \pm 1 \bmod n$.

Moreover, n passes at most r - 1 iterations of the loop (*), where $r = \nu_2(n-1)$.
Additionally, for $k \ge 2$ and random input Q, n passes k iterations of (*) with
probability at most $1/3^{k-1}$.

Proof. The first assertions are obvious. Now suppose n — 1 is at least divisible 
by 2^ and that n enters the loop (*) at least twice. 

Note that after each iteration (*) the relation $a^2 \equiv Qb \bmod n$ holds. Once
$b \equiv 1 \bmod n$, the desired solution is found.
From the previous iteration we have $b^{2^{m-1}} \equiv -1 \bmod n$. Let $h = t^2$. From

the latter condition and the fact that = —1 mod n it follows exactly as 

when n is a prime, that h has order 2™ modulo n. 

If firstly = — 1 mod n (what would happen if n were prime) , then 

$hb = t^2 b$ has order dividing $2^{m-1}$ and the new b enters the next loop. But this
means that each new b has an order which is smaller by at least one factor of 2
than that of the previous b. This explains the condition that each new m has to be less
than k (which was the previous m). Equivalently, the sequence of the k in the
loop is strictly decreasing, so that altogether there are fewer than r iterations of
(*) (unless n is already previously disclosed as composite).

On the other hand, if ^ mod n, but b^ = — 1 mod n, then 

$(hb)^{2^{m-1}} \not\equiv 1 \bmod n$ and $hb$ (which is the new b) has order $2^m$, as does the
previous b. In this case, the new b does not fulfill (**).

It follows from above that unless the algorithm already terminated, we have
$b^{2^M} \equiv 1 \bmod n$ for some M. If M = 0, we are done. Otherwise, we are seeking
the smallest m with $b^{2^{m-1}} \equiv -1 \bmod n$, when $b \ne 1$, i.e., when $m \ge 1$. In that
case, in analogy to the Miller-Rabin test, the first such power of b before 1 has
to be -1. When we first arrive at 1, without encountering -1, n is immediately
disclosed as composite, and the gcd above obviously yields a proper factor of n.
In exactly such a case the algorithm terminates at a point where it would not if
n were prime. Thus, the above algorithm terminates much faster for composites.
Precisely, it terminates for each case where $b^{2^m} \equiv 1 \bmod n$, but $b^{2^{m-1}} \equiv 1 \bmod p$
for one prime p dividing n, and $b^{2^{m-1}} \equiv -1 \bmod q$ for another prime $q \mid n$. It does
not terminate when $b^{2^{m-1}} \equiv -1 \bmod p$ for all $p \mid n$. If n is the product of two
primes, the latter only happens in one out of three cases, while if n has more
factors, the probability not to terminate is even smaller. Thus, in at most 1 out of
3 cases each additional iteration of (*) does not terminate. The desired assertion
follows from the hypothesis that the Q are randomly chosen (subject only to the
condition $\left(\frac{D}{n}\right) = -1$), which implies that all the b values are random. □

For the special case n = 5 mod 8 the above can be achieved even more simply.

// Alternative Case of Step 2 of the Proposed Test.

1. Select randomly d in Z_n^*.

If n is not spsp(2d^2), declare n to be composite.

2. Let t <- (2d^2 Q)^((n-5)/8) mod n and i <- t^2 * 2d^2 Q mod n.

3. If not i^2 = -1 mod n, declare n to be composite, otherwise

a = tdQ(i - 1) mod n is a square root of Q modulo n.

When n is known to be prime, this always gives a square root of Q via one
exponentiation only (then clearly the first step can be omitted).
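The n = 5 mod 8 variant is short enough to sketch directly. The code below is an illustration under stated assumptions, not the author's implementation: the function names are hypothetical, `None` stands for "declare n composite", and d and Q are assumed coprime to n with (Q/n) = 1.

```python
def is_spsp(base: int, n: int) -> bool:
    """Strong probable prime (Miller-Rabin) condition for odd n to the given base."""
    r, s = 0, n - 1
    while s % 2 == 0:
        r, s = r + 1, s // 2
    x = pow(base, s, n)
    if x in (1, n - 1):
        return True
    for _ in range(r - 1):
        x = x * x % n
        if x == n - 1:
            return True
    return False


def atkin_root(Q: int, n: int, d: int):
    """Square root of Q mod n for n = 5 mod 8, or None if n is exposed as composite."""
    if n % 8 != 5:
        raise ValueError("this variant requires n = 5 mod 8")
    base = 2 * d * d % n
    # Step 1: Miller-Rabin condition with respect to the base 2 d^2.
    if not is_spsp(base, n):
        return None
    # Step 2: a single exponentiation.
    t = pow(base * Q % n, (n - 5) // 8, n)
    i = t * t % n * base % n * Q % n
    # Step 3: i must be a square root of -1.
    if i * i % n != n - 1:
        return None
    return t * d % n * Q % n * (i - 1) % n
```

With $i = 2d^2Qt^2$ and $i^2 \equiv -1$, one gets $a^2 = t^2d^2Q^2(i-1)^2 = t^2d^2Q^2(-2i) = -i^2 Q \equiv Q \bmod n$, so a returned value is always a correct root, even for composite n.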

Lemma 3. If a composite n = 5 mod 8 passes the above algorithm, then a and
i are correct roots of Q and -1 mod n, respectively. Moreover, n is $spsp(2d^2)$,
as well as $spsp(2d^2Q)$. As a consequence, n is also epsp(Q).

Proof. This follows since any epsp(a) for $\left(\frac{a}{n}\right) = -1$ is already spsp(a). □

Remark 2. For d = 1 the above algorithm was proposed by Atkin [5], and actu-
ally constitutes a deterministic root-finding method for primes n = 5 mod 8.

Step 1 is necessary to have n epsp(Q), which will be required below. We
incorporate the random value d to minimize the failure probability by means of
Miller-Rabin with respect to the random base $2d^2$.

Corollary 1. Suppose a composite integer n passes the proposed test. Then,
in the case of the square root finding algorithm for n = 5 mod 8, this implies
$\alpha(P,Q)^{(n+1)/2} \equiv \bar\alpha(P,Q)^{(n+1)/2} \bmod n$, and in the case of the
Shanks-based root finding algorithm, the latter value is congruent to ±a mod n.
In both cases, $\alpha(P,Q)^{n+1} \equiv \bar\alpha(P,Q)^{n+1} \equiv Q \bmod n$.

Proof. The first part follows from above. Note that if n passes the root-finding
algorithm then it is epsp(Q). But if n is epsp(Q) and elpsp(P,Q), then by well-
known results [24], this implies $\alpha(P,Q)^{n+1} \equiv \bar\alpha(P,Q)^{n+1} \equiv Q \bmod n$. □

3 Performance 

3.1 Evaluation of the QF-Based Part

By property (1), the QF-part can be evaluated via the V-functions only. Using
the identities $V_{2k}(P,1) = V_k(P,1)^2 - 2$ and $V_{2k+1}(P,1) = V_k(P,1)V_{k+1}(P,1) - P$,
this can be done via a simple powering ladder analogously as for exponentiation.

The algorithm in [34] can easily be modified to obtain two consecutive V- 
values, as required. The operations are done modulo n. 

INPUT: m = sum_{j=0}^{l} b_j 2^j, the binary representation of m, and P.

OUTPUT: The pair V_m(P,1) and V_{m+1}(P,1).

1. (Initialization) Set d_1 <- P, d_2 <- P^2 - 2.

2. (Iterate on j) For j from l - 1 down to 1 do

If b_j = 1, set d_1 <- d_1 d_2 - P, d_2 <- d_2^2 - 2.

If b_j = 0, set d_2 <- d_1 d_2 - P, d_1 <- d_1^2 - 2.

3. (Evaluate) Let w_1 <- d_1 d_2 - P, w_2 <- d_1^2 - 2.

If b_0 = 1 return (w_1, P w_1 - w_2), else return (w_2, w_1).

Thus, the pair $V_{(n+1)/2}(P,1)$, $V_{(n+1)/2+1}(P,1)$ may be computed modulo n
using fewer than $2 \log_2(n)$ multiplications mod n and $\log_2 n$ additions mod n.
Half of the multiplications mod n are squarings mod n.
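The ladder translates almost line by line into code. The sketch below uses a hypothetical function name and treats m = 1 separately, since the three-step description presupposes at least two binary digits.

```python
def lucas_v_pair(m: int, P: int, n: int) -> tuple[int, int]:
    """Return (V_m(P,1) mod n, V_{m+1}(P,1) mod n) for m >= 1."""
    bits = bin(m)[2:]                       # b_l ... b_0 with b_l = 1
    d1, d2 = P % n, (P * P - 2) % n         # (V_1, V_2)
    if len(bits) == 1:                      # m == 1: nothing left to do
        return d1, d2
    # Invariant: (d1, d2) = (V_j, V_{j+1}) where j is the prefix read so far.
    for bit in bits[1:-1]:                  # j = l-1 down to 1
        if bit == "1":
            d1, d2 = (d1 * d2 - P) % n, (d2 * d2 - 2) % n
        else:
            d1, d2 = (d1 * d1 - 2) % n, (d1 * d2 - P) % n
    # Final bit b_0: one product and one squaring.
    w1 = (d1 * d2 - P) % n                  # V_{2j+1}
    w2 = (d1 * d1 - 2) % n                  # V_{2j}
    if bits[-1] == "1":
        return w1, (P * w1 - w2) % n        # (V_{2j+1}, V_{2j+2})
    return w2, w1                           # (V_{2j}, V_{2j+1})
```

Each bit costs one squaring and one general multiplication, which is where the bound of fewer than $2 \log_2(n)$ multiplications comes from.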

// Detailed Description of Step 3 of the Proposed Test.

— Let k = (n+1)/2 and evaluate (V_k(P',1), V_{k+1}(P',1)) modulo n.

— Test, if 2V_{k+1}(P',1) = P'V_k(P',1) mod n. If not, declare n to be
composite.

— Compute gcd(V_k(P',1) ± 2, n). If this reveals a factor of n,
output the factor. Otherwise declare n to be a probable prime.
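Step 3 can be sketched as follows. This is an illustration only: the names are hypothetical, P' is assumed to be P/a mod n for a square root a of Q (as in Sect. 2.2), and the V-pair is computed by a plain bit-by-bit ladder starting from $(V_0, V_1)$, which is slightly simpler than the optimised ladder described above.

```python
import math


def v_pair(m: int, P: int, n: int) -> tuple[int, int]:
    """(V_m(P,1), V_{m+1}(P,1)) mod n via a left-to-right binary ladder."""
    v, w = 2 % n, P % n                                  # (V_0, V_1)
    for bit in bin(m)[2:]:
        if bit == "1":
            v, w = (v * w - P) % n, (w * w - 2) % n      # (V_{2j+1}, V_{2j+2})
        else:
            v, w = (v * v - 2) % n, (v * w - P) % n      # (V_{2j}, V_{2j+1})
    return v, w


def qf_step(n: int, P: int, a: int):
    """QF-based part for parameters P and a root a of Q (gcd(a, n) = 1 assumed)."""
    P_shift = P * pow(a, -1, n) % n          # P' = P / a mod n (assumption)
    k = (n + 1) // 2
    vk, vk1 = v_pair(k, P_shift, n)
    # Vanishing of U_k(P',1), checked through two V-values as in identity (1).
    if (2 * vk1 - P_shift * vk) % n != 0:
        return ("composite", None)
    # gcd(V_k +- 2, n) exposes a factor unless V_k = +-2 mod n.
    for shift in (2, -2):
        g = math.gcd(vk + shift, n)
        if 1 < g < n:
            return ("factor", g)
    return ("probable prime", None)
```

For a prime n and parameters with $\left(\frac{P^2 - 4a^2}{n}\right) = -1$ the function always reports a probable prime, since then $V_k(P',1) \equiv \pm 2 \bmod n$.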
3.2 Runtime Analysis

In [16], J. Grantham suggested a unit measure for a probable prime test based
on the running time of the Miller-Rabin test. An algorithm with input n is said
to have running time of k selfridges if it can be computed in $(k + o(1)) \log_2 n$
multiplications mod n. For simplicity, squarings are counted as multiplications.

As exponentiation to the t-th power can be done in $(1 + o(1)) \log_2 t$ multipli-
cations by using easily constructed addition chains [18], the Miller-Rabin test
has running time of at most 1 selfridge.

Theorem 2. — For random input Q and u, the proposed test, via the general
root finding algorithm, has average running time 4 selfridges.

— For the n = 5 mod 8 based root finding algorithm, the proposed test always 
has running time less than 4 selfridges. 

Proof. By the above, Step 3 of the proposed test requires at most two selfridges.

The Atkin-based method always requires two exponentiations, so we only 
need to consider the general Shanks-based root finding algorithm. Precomputa- 
tion and initialization require one exponentiation each. It follows from [21] that 
the number of multiplications averaged over primes n = 1 mod 4 is $o(1) \log_2(n)$.
The additional squarings that we require in Step 1 for the Miller-Rabin test base
u can be comprised in the $o(1) \log_2(n)$ multiplications above.

We upper bound the number of multiplications in the worst case required by
the loop (*). If n is spsp(u) then $z = u^s$ generates $S_r$, the 2-Sylow subgroup of
$Z_n^*$. So $S_r$ has order $2^r$, $S_{r-1}$ has order $2^{r-1}$, is generated by $z^2$, and in general,
$S_{r-i}$ has order $2^{r-i}$ and is generated by $z^{2^i}$ for i = 0, 1, ..., r.

The condition (**) indicates which of the 2-subgroups b is in. Alternatively,
we can consider the values that k takes in the algorithm, which also (except for
the first k), specifies the subgroup where b is in. Namely, $b \in S_k \setminus S_{k-1}$.

Recall that the sequences of the k-values have to be strictly decreasing. E.g.,
for order r = 4, the possible k-sequences are, (4, 1), (4, 2), (4, 2, 1), (4, 3), (4, 3, 1),
(4, 3, 2), (4, 3, 2, 1). Generally there are $2^{r-1}$ such k-sequences.

For random Q's and u's the values b are random as well and it can be
shown (see [21, p. 235]) that every k-sequence has the same probability. Lind-
hurst determined the total number of multiplications $C_r$ over all the possible
k-sequences and then divided by the number of sequences, $2^{r-1}$, to get the av-
erage. Then, the average number of multiplications (after the initialization), is
$C_r/2^{r-1} = (r^2 + 7r - 12)/4 + 1/2^{r-1}$ (see [21, p. 236]).

Although all sequences are equally likely, they can be grouped into those with
the same length. The $2^{r-1}$ sequences of order r are obtained by fixing the r
as first value of the sequence, and by determining the $\binom{r-1}{1}, \binom{r-1}{2}, \ldots, \binom{r-1}{r-1}$
subsequences $(k_1, \ldots, k_i)$ of respective lengths 1, 2, ..., r - 1. This shows that an
average sequence is expected to have length about r/2. Equivalently, on average,
the loop (*) is iterated r/2 times. Additionally, for r > 8, more than 99 % of all 
sequences have length between [|J and ]"^]. 
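The counting argument can be made concrete by enumerating the k-sequences. In the sketch below (hypothetical names) a k-sequence is the tuple (r, k_1, ..., k_i) with r > k_1 > ... > k_i >= 1, and the sequence consisting of r alone (no pass through the loop) is included in the count, which is what gives a total of $2^{r-1}$.

```python
from itertools import combinations


def k_sequences(r: int) -> list[tuple[int, ...]]:
    """All strictly decreasing sequences that start with r and stay >= 1."""
    seqs = []
    for length in range(0, r):
        # combinations of a descending range come out in descending order
        for tail in combinations(range(r - 1, 0, -1), length):
            seqs.append((r,) + tail)
    return seqs


def mean_loop_iterations(r: int) -> float:
    """Average number of entries after the leading r, all sequences equally likely."""
    seqs = k_sequences(r)
    return sum(len(s) - 1 for s in seqs) / len(seqs)


if __name__ == "__main__":
    print(k_sequences(4))
    print(mean_loop_iterations(8))
```

The average tail length comes out as (r - 1)/2, in line with the statement that the loop (*) is iterated about r/2 times on average.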
But then Lemma 2 and Lemma 6 below imply that on average at most
(r +7r-^i2)/4+i/2 — multiplications are to be expected before the algorithm ter- 
minates, when n is composite. Comparing numerator and denominator, we see 
that this is much less than for primes, as repeated iterations of (*) are much less 
likely. □ 

3.3 The Iterated Test 

After the first round it is more efficient to shorten each of the following iterations, 
instead of re-running the entire procedure. Iterating the entire procedure would give
a failure probability of about $1/2^{20k}$ at a cost of 4k selfridges for k rounds.

Below, the failure rate of the QF-based part will be shown to be much smaller 
than the one based on the root finding algorithm. Yet, each of those parts requires 
about two selfridges. When being iterated, it is more efficient to repeat only a 
part of the root finding algorithm, whilst obtaining the full QF-part. In fact, for 
both of the above root finding algorithms, the first step is only required at the 
first round. This motivates the following shortened version of the proposed test 
for any iterations after the first. 

// Iterations After the First Round of the Test. 

1. (Parameter Selection) As above. 

2. (Square Root Part) 

- Let u and d, accordingly, be the values of the first round 
of the proposed test in Step 1 of the root finding part. 

- Run one of the above root finding algorithms by skipping 
the corresponding Step 1 . 

If the algorithm declares n composite, stop.

- Let a and P' be as above . 

3. (QF-Based Part) As above. 

4 The Probability Estimate 

The proof of Theorem 1 will be given in a sequence of auxiliary results. The 
general idea is to determine an upper bound on the number of the liars (i.e., 
pairs that pass) and to upper bound the ratio of these to the number of all pairs 
possible as input to the test. It was shown in [16] that for n an odd composite,
not a perfect square, the number of pairs (P,Q), such that $\left(\frac{P^2-4Q}{n}\right) = -1$ and
$\left(\frac{Q}{n}\right) = 1$, or $1 < \gcd(P^2 - 4Q, n) < n$, or $1 < \gcd(Q,n) < n$, is more than $n^2/4$.



4.1 The QF-Based Part 

Underlying all the pseudoprimality tests based on quadratic fields is the inves-
tigation of the powers of the characteristic roots $\alpha$, $\bar\alpha$. It is well known that
if n is any integer with gcd(Q,n) = 1 then there is a positive integer m such
that $\alpha(P,Q)^m \equiv \bar\alpha(P,Q)^m \bmod n$. Let $\rho = \rho(n,P,Q)$ be the least such positive
integer. This is usually called the rank of appearance (apparition) [36,41].

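The rank of appearance has the following properties (see [11,36,39]).

$\alpha(P,Q)^m \equiv \bar\alpha(P,Q)^m \bmod k$ if and only if $\rho(k,P,Q) \mid m$,    (3)

$\rho(p,P,Q) \mid p - e(p)$, and $\rho(p,P,Q) \mid (p - e(p))/2$ iff $\left(\frac{Q}{p}\right) = 1$,    (4)

$\rho(\mathrm{lcm}(m_1, \ldots, m_k)) = \mathrm{lcm}(\rho(m_1), \ldots, \rho(m_k))$,    (5)

If $p^c \,\|\, \alpha(P,Q)^{\rho(p,P,Q)} - \bar\alpha(P,Q)^{\rho(p,P,Q)}$ then $\rho(p^\gamma,P,Q) = p^{\max(\gamma-c,0)} \rho(p,P,Q)$.    (6)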
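For a prime modulus the rank of appearance is easy to compute by brute force, which allows the divisibility property for p - e(p) and (p - e(p))/2 to be checked on small cases. The sketch assumes p is an odd prime not dividing QD, in which case the two characteristic roots have equal m-th powers mod p exactly when $U_m(P,Q) \equiv 0 \bmod p$; the function names are hypothetical.

```python
def legendre(a: int, p: int) -> int:
    """Legendre symbol (a/p) for an odd prime p, as -1, 0 or 1."""
    x = pow(a, (p - 1) // 2, p)
    return -1 if x == p - 1 else x


def rank_of_appearance(p: int, P: int, Q: int) -> int:
    """Least m > 0 with U_m(P,Q) = 0 mod p (p odd prime, p not dividing Q*D)."""
    u_prev, u = 0, 1                      # U_0, U_1
    m = 1
    while u % p != 0:
        u_prev, u = u, (P * u - Q * u_prev) % p
        m += 1
    return m


if __name__ == "__main__":
    p, P, Q = 13, 3, 1
    e = legendre(P * P - 4 * Q, p)
    rho = rank_of_appearance(p, P, Q)
    print(rho, (p - e) % rho == 0, ((p - e) // 2) % rho == 0)
```

The linear search takes up to p + 1 steps, so it is only meant for experiments with small primes.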



A necessary condition for the test to pass is $\alpha(P,Q)^{(n+1)/2} \equiv \bar\alpha(P,Q)^{(n+1)/2} \bmod n$.
Since $p \nmid n+1$ for $p \mid n$, we need not consider the pairs (P,Q) modulo $p^\alpha$ whose rank
is a multiple of p (compare (6)). Thus, it suffices to investigate the parameters
whose rank is an odd divisor of p - e(p), since (n + 1)/2 is odd for n = 1 mod 4.

Given n, the task is to count the number of the liars (P,Q), which is deter-
mined by the rank of appearance of each of these pairs. But this requires knowl-
edge of the individual quadratic residue symbols $\left(\frac{Q}{p_i}\right)$ and $e(p_i) = \left(\frac{D}{p_i}\right)$ for
all primes $p_i \mid n$.

Generally, these values are not known for the number n to be tested for pri- 
mality. However, certain conditions on these symbols are automatically satisfied 
when a composite n indeed passes the test. Specifically, by Lemma 1 it suffices
to consider the case that $\left(\frac{Q}{p}\right) = 1$ for any prime p dividing n. We separately
consider the values $e(p_i)$.



Definition 1. Let $n = \prod_{i=1}^{\omega} p_i^{\alpha_i}$ where $\omega = \omega(n)$ is the number of differ-
ent prime factors of n. For $1 \le i \le \omega$ let $e_i = e(p_i) \in \{1,-1\}$, and call
$(e) = (e(p_1), \ldots, e(p_\omega))$ the signature modulo n with respect to P and Q, when
$\left(\frac{P^2-4Q}{p_i}\right) = \left(\frac{D}{p_i}\right) = e(p_i)$ for all i. Similarly, we call each $e(p_i)$ the signature
modulo $p_i \mid n$, and e(p) the signature modulo any prime p.



Throughout, P is assumed to be different from 0, since otherwise the rank of 
appearance modulo n is always equal to 2. (This is no restriction as for P = 0 
always (D/n) = 1 in our case.) Proposition 2 was proved in [28] and Proposition 
3 was proved in [25]. 



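Proposition 2. Let k, $p \nmid k$, be a positive integer and $e \in \{-1,1\}$ a constant.
For a fixed value of $P_0$, $P_0 \ne 0$, the number of Q mod $p^\alpha$ such that $\left(\frac{Q}{p}\right) = 1$,
$(P_0,Q)$ has signature e mod p, and $\alpha(P_0,Q)^k \equiv \bar\alpha(P_0,Q)^k \bmod p^\alpha$, equals
$\frac{1}{2}\left(\gcd(k, \frac{p-e}{2}) - 2\right)$ if $2 \mid k$ and $2 \mid \frac{p-e}{2}$, and $\frac{1}{2}\left(\gcd(k, \frac{p-e}{2}) - 1\right)$, otherwise.



Proposition 3. Let k be a positive integer with $p \nmid k$ and $e \in \{-1,1\}$ a constant.
For a fixed value of $Q_0$, $\left(\frac{Q_0}{p}\right) = 1$, the number of P mod $p^\alpha$ such that $(P,Q_0)$
has signature e and $\alpha(P,Q_0)^k \equiv \bar\alpha(P,Q_0)^k \bmod p^\alpha$ is $\frac{1}{2}\gcd(k, p-e) - 1$, when
$\nu_2(k) \ge \nu_2(p-e)$, and $\gcd(k, \frac{p-e}{2}) - 1$, otherwise.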
Corollary 2. Let a signature (e) be fixed. Then the number of pairs (P,Q) that
fulfill $\alpha(P,Q)^{(n+1)/2} \equiv \bar\alpha(P,Q)^{(n+1)/2} \bmod n$ with respect to this signature is at



most 



nti (gcd( 



n+l 

2 



2 



)-i) 



Lemma 4. Let n = 1 mod 4 be an odd integer, not a perfect square. If $p_j$ is a
prime such that $p_j^2$ divides n, then n is slpsp(P,Q) for $\left(\frac{D}{n}\right) = -1$ with probability
less than $1/(8p_j)$.

Proof. Let (e) be a fixed signature. By Corollary 2, the number of liars (P, Q) 
with respect to this signature is at most (l/2^‘^) 0^=1 (P* ~ 1)^ ' Y\t=iPV~^ ■ 

If w = 2, then there are two possible signatures with = —1 and so the 
number of the liars is at most (1/2®) • 0^=1 (P* ~ ' 0^=1 pT~^- This gives a fail- 
ure probability of less than (1/23) -0^1 Oti Pi ^ l/(23-n/=i?3/*"^) < 

l/{8pj). For oj > 3 there are always less than 2“ different signatures with 
= —1 and the number of liars is less then (l/2^“) - which gives a 

probability of at most 1 / (2^‘^“^Pj ) < l/{2‘^pj). Finally, if w = 1, so that n = , 

then necessarily aj > 2 by hypothesis and the probability in this case is at most 

l/(2p|). □ 

Typical for pseudoprimality testing based on the Fermat/QF-based combinations
is the fact that $\left(\frac{D}{p}\right) = 1$ becomes rather unlikely for $p \mid n$ when $\left(\frac{D}{n}\right) = -1$.

Proposition 4. The number of pairs {P, Q) mod n for which a squarefree inte- 
ger n with oj prime factors fulfills a{P, (5)^"“*-i)/2 = a{P, mod n such 

that 1 some Pi\n, is given as follows. It is less than if 

to = 2, less than n(f{n) ^ 4 zs even, and less than 

if CO is odd. 

Proof. See the proof to Proposition 5 in [28], where exactly the number of such 
pairs is being established. □ 



Remark 3. For $\omega = 2$ the proof in [28] shows that the above quantities are only
obtained for strongest divisor properties, like $\mathrm{odd}(p_i + 1) \mid n + 1$ for one $p_i \mid n$, and
$\mathrm{odd}(p_j + 1) \mid t(n + 1)$ for t = 3 and the other $p_j \mid n$. Otherwise, the results would
be much smaller.

When the test passes for some fixed $Q = Q_0$, then we have for each parameter
P, $\alpha(P,Q_0)^{(n+1)/2} \equiv \bar\alpha(P,Q_0)^{(n+1)/2}$ is either equivalent to $a^{(n+1)/2} \bmod n$,
or to $-a^{(n+1)/2} \bmod n$, where a is independent of P, and by the root finding
algorithms is uniquely determined by the $Q_0$. For all P that pass, this determines
a specific general 'multiplier' $S = a^{(n+1)/2}$, resp. $S = -a^{(n+1)/2}$ modulo n. The
proof to the next result is analogous to Lemma 5, [26] (see Proposition 4, [28]).
Lemma 5. Let n = 1 mod 4 be any composite integer, and $Q = Q_0$, as well
as some 'multiplier' S be fixed. If p is any prime dividing n, then there are at
most $\frac{1}{2}\left(\gcd(\frac{n+1}{2}, p - e(p)) - 1\right)$ elements P with $\left(\frac{P^2-4Q}{p}\right) = e(p)$ for which
$\alpha(P,Q)^{(n+1)/2} \equiv \bar\alpha(P,Q)^{(n+1)/2} \equiv S \bmod p$.

Corollary 3. For a squarefree n = 1 mod 4 let (e) he a fixed signature. Then the 
number of pairs (P, Q) with a{P, = a{P, mod n 

w.r.t. this signature is at most ^ 2^-1 

Remark 4. It is essential that n = 1 mod 4 to have (n + 1)/2 odd. For n =
3 mod 4 analogous, but more involved results can be obtained, [28].

This gives the error rate for each iteration of the test (after the first round) . 

Theorem 3. Let P and Q be randomly chosen in Step 1 of the proposed test.
Let n = 1 mod 4 be a composite integer which is not a perfect square and not
divisible by primes up to B. Then the probability that n fulfills
$\alpha(P,Q)^{(n+1)/2} \equiv \bar\alpha(P,Q)^{(n+1)/2} \bmod n$ for $a^2 \equiv Q \bmod n$, is given as follows.

— If n is not a product of exactly three prime factors, it is less than
$1/2^{17} + 4/B^2 < 1/131040$.

— If n is the product of three different primes, and if n is further epsp(Q), then
it is less than $4/B^2 + 3(B^2 + 1)/2(B^4 - 3B^2)$.

Proof. When n is not squarefree, Lemma 4 gives the result. If a squarefree n has 
an even number of prime factors we apply Proposition 4, where the probability 
becomes largest for w = 2 in which case it is less than 5/{2'^B) < 1/160000. 

Further, if $n = p_1 p_2 p_3$ is squarefree and has exactly 3 prime factors, we
can use Lemma 2.11 of [16]. In this Lemma, Grantham separately considers the
cases, $e(p_i) = 1$ for some i, and $e(p_i) = -1$ for all i. By Proposition 4
(which corresponds to Lemma 2.9 of [16] when $\omega$ is odd), the former case yields
a probability of $4/B^2$. In the latter case, necessarily $\alpha^{n+1} \equiv Q \bmod p_i$ and
$\alpha^{p_i+1} \equiv Q \bmod p_i$ so that $\alpha^{n-p_i} \equiv 1 \bmod p_i$, since n is epsp(Q) by hypothesis
(see Corollary 1). This congruence holds for exactly $\gcd(n - p_i, p_i^2 - 1)$ elements.
Since n has only three factors, these gcds cannot all be equal to their maximal
value, $p_i^2 - 1$. Indeed, Grantham gives an upper limit for these quantities. From
this, he obtains the probability for such pairs which pass the test. By adding
both cases, the probability can be bounded by $4/B^2 + 3(B^2 + 1)/2(B^4 - 3B^2)$.

It remains to consider the case where n is squarefree and divisible by an odd
number $\omega$ of at least 5 prime factors. The number of pairs with $\left(\frac{D}{p}\right) = 1$ for at
least one $p \mid n$ is again by Proposition 4 less than $n^2/B^2$. So it suffices to consider
the pairs with $\left(\frac{D}{p}\right) = -1$ for all primes $p \mid n$. In this case the number of pairs is

by Corollary 3 at most (1/2^““^) Y\{pi — 1)^. When adding these two cases, the 
probability is upper bounded by $1/2^{17} + 4/B^2$ which is less than 1/131040. □
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4.2 The Square Root Finding Based Part 

It is well known that when a is taken randomly from Z_n^*, the probability for a to
be a Miller-Rabin liar is at most 1/4. If the Jacobi symbol (a/n) is fixed to some
special value (e.g., −1), then there are only φ(n)/2 such a as possible input values to
the Miller-Rabin test. Yet, even for a fixed Jacobi symbol, we show below that in
our case the failure rate is smaller than the expected 2/4.
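The proportion discussed above can be explored numerically for small n. The following is only an illustrative sketch, not part of the proposed test: the helper names (jacobi, is_strong_liar, liar_rate_minus_one) are hypothetical, and the brute-force enumeration is only practical for tiny moduli.

```python
from math import gcd

def jacobi(a, n):
    """Jacobi symbol (a/n) for odd n > 0 (standard binary algorithm)."""
    a %= n
    result = 1
    while a:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0

def is_strong_liar(a, n):
    """True if odd n > 2 passes the Miller-Rabin test to base a."""
    r, s = 0, n - 1
    while s % 2 == 0:
        r, s = r + 1, s // 2
    x = pow(a, s, n)
    if x in (1, n - 1):
        return True
    for _ in range(r - 1):
        x = x * x % n
        if x == n - 1:
            return True
    return False

def liar_rate_minus_one(n):
    """Return (strong liars with (a/n) = -1, number of units with (a/n) = -1)."""
    units = [a for a in range(1, n) if gcd(a, n) == 1 and jacobi(a, n) == -1]
    liars = sum(is_strong_liar(a, n) for a in units)
    return liars, len(units)
```

For a composite n that is not a perfect square, the second returned value equals φ(n)/2, and the ratio of the two values is the failure rate restricted to bases with Jacobi symbol −1.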

The following result can immediately be verified. 

Proposition 5. Let n be spsp(a), where (a/n) = −1, and let p be any prime
divisor of n. Then, if (a/p) = −1, we have ν_2(p − 1) = ν_2(n − 1), and if (a/p) = 1,
we have ν_2(p − 1) > ν_2(n − 1).

Notation: Let ν(n) denote the largest integer such that 2^{ν(n)} divides p − 1
for each prime p dividing n. As above, write n − 1 = 2^r s with s odd.

Proposition 6. Suppose n is spsp(a) for (a/n) = −1. Then a ∈ S_{−1}(n) where

S_{−1}(n) = {a mod n : a^{2^{ν(n)−1} s} ≡ −1 mod n}. Moreover, we have

#S_{−1}(n) = 2^{(ν(n)−1) ω(n)} ∏_{p|n} gcd(s, p − 1).

Proof. If (^) = —1 then there exists p\n with = —1 and by Proposition 

5, V 2 {p — 1) = r = v{n). Moreover, in that case, V 2 {p — 1) = V 2 {ordp{a)). By 
a standard result for n being spsp{a) (see e.g., [2]), we also have j^ 2 (ordp(a)) = 
V 2 {ordq{a)) , so that V 2 {ordq{a)) = v{n) for any q\n. In particular, if ^ = 

— 1 mod n for some 0 < i < r — 1 (the first case for n being spsp{a)), then 
of ® = — 1 mod q for any q\n. Note also that the case a® = 1 mod n (the 
second case for n being spsp{a)), is impossible, since af'^~PI’^ = —1 mod n by 
hypothesis. 

The cardinality ffS-i{n) follows from [13, p. 128]. □ 



Lemma 6. Suppose an odd composite integer n, not a perfect square, is not the
product of exactly three prime factors. Let a ∈ Z_n^* be chosen randomly from the
set of all b with (b/n) = −1. Then the probability that n is spsp(a) is given as
follows. If n = p_1 p_2 where p_1 = 2^r t + 1 and p_2 = 2^{r+1} t + 1, 2 ∤ t, it is at most
1/4. Otherwise, it is at most 1/8.

Proof. We follow the proof of Lemma 3.4.8 in [13]. The desired probability
can be determined via

φ(n) / (2 #S_{−1}(n)) = (1/2) ∏_{p^α || n} p^{α−1} (p − 1) / (2^{ν(n)−1} gcd(s, p − 1)).

Note that each factor (p − 1)/(2^{ν(n)−1} gcd(s, p − 1)) is an even integer.
Then, if ω(n) ≥ 4, we have φ(n)/(2 #S_{−1}(n)) ≥ 1/2 · 2^4 = 8.
If ω(n) = 2, we distinguish the following cases. Suppose 1 for one

p\n. Then gcd(s,p — 1) < (p — l)/8 and therefore </>(n)/(2#5_i(n)) > 

1/2 • (2 - 8 ) = 8 . 

Now, let 1 for one p|n, where <5 equals 0 or 1. Write the two primes 

in the form pi = + 1 and pa = + 1- 

For the case that ti ^ Arnault [4, p. 877] showed that tijs and tajs is 
simultaneously impossible. This means that for at least one pi, gcd(s,pj — 1) < 
ti/3. If d = 0, then ^(n)/(2#5_i(n)) > 1/2 • (2 • 6) = 6, while if 5 = 1, this 
introduces an additional factor of 2, and (/(n)/(2#5_i(n)) > 12. 

The result of the Lemma follows, since for = —1 and w(n) = 2 there 

is one prime factor pi with = 1, so that by Proposition 5, z^a(Pi — 1) > 
V 2 {'n — 1) = v^iPj ~ 1) = v(n). This means we do have <5 = 1, as required. 

Finally, the special case pi = 2^t + 1 and pa = 2^+^t + 1 implies tjs, in which 
case (j){n)/{2^Sj{n)) > 1/2 • (2 • 4), since 2'^(”)+^|pa — 1. □ 



4.3 Proof of the Main Result 

Proof of Theorem 1. Suppose firstly that n is a product of three different prime
factors. Then Theorem 3 and Lemma 2, respectively Lemma 3, give the result.

For the Atkin-based root finding method, Lemma 3 asserts that n is spsp(2d^2).
Since (2d^2/n) = −1 for n ≡ 5 mod 8, we can apply Lemma 6. By
assumption, d is chosen randomly in the square root finding algorithm. For
random selection of this basis, the condition on n to be spsp(2d^2) is independent
of the QF-based test.

If n is not such a special two-factor integer as described in Lemma 6, this
introduces a factor of 1/8 (for each random d) in addition to the failure probability
obtained above in Theorem 3 for the test that checks the QF-condition.

If n passes the Shanks-based method, it firstly is spsp(u) for u with (u/n) = −1.
For randomly chosen u this again introduces a factor of 1/8 in the failure
probability.

Finally, for both types of the root finding algorithms, if n does have the
special two-factor form, then it follows easily that Proposition 4 introduces a
much smaller failure rate than above (the corresponding number of the
QF-liars, which is based on the quantities gcd(n + 1, p ± 1), becomes much smaller
when the odd part of p − 1 divides n − 1). In total, for ω = 2 the largest failure
rate applies to the general type of two-factor numbers.

Thus, we have the failure rate, for the first round, F_1 = 1/2^20 + 1/(2B^2),
and for k − 1 additional iterations, F_1 · (1/2^17 + 4/B^2)^{k−1}. For larger k the B
proportion is negligible, so that for a total of k rounds we have failure
approximately 1/(2^20 · 2^{17(k−1)}) = 1/2^{17k+3}. □
5 Open Problems and Further Remarks

While it achieves the much smaller failure rate of 1/1048350, our test has a running
time 4 times that of the Miller-Rabin test. We do not know how much further the
failure rate can be reduced when more evaluation time is allowed in
each round. Conversely, the question is how to optimally handle the
tradeoff between reliability and running time, and how large the running time of a
test can become before it stops making sense in practice.

Strong pseudoprimes with respect to at least 4 random bases exist very often. 
Below n = 1000 there are 54 such composites with at least four non-trivial bases 
as liars. Our test, without trial division and, for simplicity d = 1 (see Lemma 3) 
for n = 5 mod 8, would for any possible pairs of parameters detect these. 

Similarly, it is extremely easy to construct strong pseudoprimes with respect 
to at least 4^, 4^, ..., random bases. We do not know, computationally, how 
much more effort is required for the generation of pseudoprimes for the
iterated proposed test (say, for the n ≡ 5 mod 8 algorithm with d = 1). Here, the
typical Fermat/Lucas restrictions come into play and considerably limit the
effectiveness of the Fermat-based generation methods for pseudoprimes.

On the other hand, sometimes it seems that many repeated iterations would 
not be necessary, if the input parameters have certain advantageous values. For 
the Miller-Rabin test, it is known that the bases 2, 3,5,7 seem to work better, 
as they are primitive roots for most primes. 

Even more effectively, the special choice of the parameters in the Baillie-PSW 
test essentially improves on its reliability. 

If the proposed tests were run for one pair of parameters only, it is not known 
to what extent, and for which parameters it is most reliable. 

Note added in proof: I. Damgard and G. Frandsen recently established a 
QF-based test with average case error estimates [15]. 
Abstract. We describe in this article how we have been able to extend
the record for computations of discrete logarithms in characteristic
2 from the previous record over F_{2^503} to a newer mark of F_{2^607}, using
Coppersmith's algorithm. This has been made possible by several practical
improvements to the algorithm. Although the computations have
been carried out on fairly standard hardware, our opinion is that we are
nearing the current limits of the manageable sizes for this algorithm, and
that going substantially further will require deeper improvements to the
method.



1 Introduction 

Among the most common paradigms upon which public key cryptographic sche- 
mes rely are the difficulty of the factorization of large integers (for the RSA 
cryptosystem), and the difficulty of computing discrete logarithms in appropri- 
ate groups (for the Diffie-Hellman key exchange protocol [14], ElGamal cryp- 
tosystem [16], and ElGamal and Schnorr [38] signature schemes). Appropriate 
groups for discrete logarithm cryptosystems are multiplicative groups of finite 
fields, the group of points of elliptic curves [26,33], and also the jacobians of 
curves of higher genus [27,4,18]. The level of security reached by the use of these 
different groups varies a lot. Both the factorization of large numbers [29] and 
the computation of discrete logarithms in finite fields [11,19,3] can be addressed 
in subexponential time. This in turn has implications for the security of some
elliptic curve cryptosystems, where the discrete logarithm problem on the curve
reduces to the discrete logarithm problem on (an extension of) the curve's
definition field [32,17]. This applies in particular to supersingular elliptic curves, where
the MOV reduction [32] makes the discrete logarithm problem subexponential.

This being said, the existence of a subexponential attack does not automatically
rule out a cryptosystem. A thorough account of which computations a
cryptanalyst can do with the current technology is necessary. While a tremendous
amount of work (and CPU time) has been put towards the factorization
of larger and larger numbers (S. Cavallar et al. used the Number Field Sieve
to factor numbers as big as 512 bits [6,9], and even numbers of up to 774 bits of
a special form [7]), the computation of discrete logarithms in finite fields does
not seem to be looked at so frequently. For prime fields, a recent work by Joux and
Lercier [22] computed logarithms in F_p with p having 120 decimal digits, i.e. 399
bits. For fields of characteristic 2, Gordon and McCurley [20] almost* computed
logarithms in F_{2^503}, but that was back in 1993. This makes it hard, today, to
make a reasonable guess on how difficult a characteristic 2 finite field discrete
logarithm problem actually is. Consequently, when the discrete logarithm on an
elliptic curve reduces to some finite field of characteristic 2, it is not easy to tell
how big this field should be for the cryptosystem to be secure.

In this context, our goal was to investigate how far we could go today in
computing discrete logarithms in F_{2^n}. The fastest algorithm for this purpose is
due to Coppersmith [11] and has complexity O(exp((c + o(1)) n^{1/3} (log n)^{2/3})), for
a small constant c ≈ 1.4. This complexity makes it comparable to the Number
Field Sieve [29], when addressing the factorization of an n-bit number. The
503-bit discrete logarithm record of Gordon and McCurley [20] was done using
massively parallel supercomputers at Sandia National Laboratories. As far as
we know, no recent state-of-the-art computations have been achieved. For our
computations, we used standard hardware: the typical computers we used were
much like everybody's desktop PC. Nonetheless, we have been able to carry the
record to a few digits higher than before by computing discrete logarithms in
F_{2^607}.

Section 2 of this article outlines Coppersmith’s algorithm. Section 3 reviews 
the rationales that drive the choice of each individual parameter in the algorithm. 
Sections 4 to 8 detail how we addressed the difficulties showing up in several parts 
of the algorithm. Section 9 shows the technical data on how the computations 
went along. 

At the time of this writing, the computations over F_{2^607} are not finished. The
sieving part is completed, and the linear algebra is underway. The computation
of the solution to the linear system is expected to be finished by the beginning
of autumn 2001. As very last-minute news, Joux and Lercier [23] appear
to have computed logarithms in F_{2^521}, using the general function field sieve
approach [2]. This approach is fairly different from the one adopted here, and
is not addressed in this paper. However, the result presented in [23] is highly
encouraging.



2 Coppersmith’s Algorithm 

Throughout this article, we will let K denote the field F_{2^n}, which will be
represented as the quotient F_2[X]/(f(X)), where f is a monic irreducible polynomial
of degree n over F_2. We will often talk of the elements of K merely as polynomials.
It will be understood that what we actually mean is a class of polynomials
inside this quotient. Likewise, the degree of a non-zero element of K will be the
minimum degree of the polynomials representing it (always between 0 and n − 1).

* The computations had not been fully carried out, since the resulting linear system 
was never solved 
It will sometimes be convenient to write f as X^n + f_1, where f_1 is a
polynomial. For the purposes of the algorithm, f_1 will be chosen so as to have the
smallest possible degree. It is believed, but not proven, that such an f_1 exists
whenever we allow its degree to grow as O(log n).

Coppersmith's algorithm belongs to the family of index-calculus algorithms.
This means that we first select a factor base B, and aim at computing the
logarithms of its elements. For this, we gather a collection of relations among
them. The relations will be of the form ∏_i π_i^{e_i} = 1, where the π_i's are the
elements of the factor base. For reasons that will become clear later, this is
referred to as the sieving part. This part can easily be distributed. Once we
have enough relations involving the elements of the factor base, we obtain their
logarithms as the solution of a (usually huge) linear system (we take the log
of each relation). This is the linear algebra part. Implementations can be done
efficiently on multiprocessor shared-memory machines, but such computers are
expensive. Distribution of the computation across a network of not-so-expensive
computers is very hard. The knowledge of all these logarithms, if the factor
base is big enough, enables us to compute any logarithm in K easily. We will
not detail that third part here since it is far easier than the two others. The
interested reader might consult Coppersmith's original article [11] for reference.

The factor base B consists of all irreducible polynomials with degree at most
a chosen bound b. It is known that B has roughly 2^{b+1}/b elements (see for instance
[31]). Up to now, Coppersmith's algorithm closely resembles Adleman's [1,
5,3], which computes discrete logarithms in any Galois field, no matter the
characteristic (but with poorer complexity than Coppersmith's). The key difference
is in the production of linear relations. To build relations among the elements of
B, we choose random relatively prime polynomials A and B of degrees d_A and
d_B, respectively. Let k be a power of 2 near sqrt(n/d_A), and h = ⌈n/k⌉. Then we
write:

C = X^h A + B,
D = C^k = A^k X^{hk} + B^k = A^k X^{hk−n} f_1 + B^k [f].

An appropriate choice of the parameters keeps the degrees of C and D balanced,
around sqrt(n d_A). For each such produced pair, we want to know whether it
is smooth or not. The pair (C, D) is smooth when both polynomials have their
irreducible factors inside B. Of course, the bigger the factor base, the more likely
this is. A smooth pair will give us a linear relation among the logarithms of the
elements of B, since if we denote them π_i, 1 ≤ i ≤ #B, we can find integers a_i
and b_i such that:

C = ∏_i π_i^{a_i},   D = ∏_i π_i^{b_i},   D C^{−k} = ∏_i π_i^{b_i − k a_i} = 1,

Σ_i (b_i − k a_i) log π_i ≡ 0 [2^n − 1].
i 
Once we have gathered enough relations, we are facing a (fairly big) linear
system that has to be solved, the unknowns being the logarithms of the elements 
of B. 
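The smoothness condition on a polynomial over F_2 can be illustrated as follows. This is a minimal sketch under stated assumptions, not the test used in the computations: polynomials are encoded as Python integers (bit i is the coefficient of X^i), all helper names are hypothetical, and smoothness is decided by repeatedly removing the common factors with X^(2^i) + X, which is the product of all irreducible polynomials of degree dividing i.

```python
def deg(p):
    """Degree of a polynomial over F_2 encoded as an int (deg(0) = -1)."""
    return p.bit_length() - 1

def pmod(a, m):
    """Remainder of a modulo m (m non-zero)."""
    dm = deg(m)
    while deg(a) >= dm:
        a ^= m << (deg(a) - dm)
    return a

def pmulmod(a, b, m):
    """Product a * b modulo m."""
    r, a = 0, pmod(a, m)
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a = pmod(a << 1, m)
    return r

def pgcd(a, b):
    """Greatest common divisor of two polynomials."""
    while b:
        a, b = b, pmod(a, b)
    return a

def pdiv(a, b):
    """Quotient of a by b."""
    q, db = 0, deg(b)
    while deg(a) >= db:
        shift = deg(a) - db
        q |= 1 << shift
        a ^= b << shift
    return q

def is_smooth(p, b):
    """True if every irreducible factor of the non-zero p has degree <= b."""
    xq = 0b10                          # X^(2^0)
    for _ in range(b):
        if p == 1:
            break
        xq = pmulmod(xq, xq, p)        # X^(2^i) mod p
        g = pgcd(p, xq ^ 0b10)         # distinct factors of degree dividing i
        while g != 1:                  # strip them with their multiplicity
            p = pdiv(p, g)
            g = pgcd(p, g)
        xq = pmod(xq, p)
    return p == 1
```

A pair (C, D) would then be declared smooth when both is_smooth(C, b) and is_smooth(D, b) hold.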



3 Choice of the Parameters 

Coppersmith’s algorithm introduces many parameters that may seem arbitrary 
at first glance. In [11], Coppersmith computed the asymptotical optimum value 
for each of them. We will not redo this analysis here, but rather briefly discuss 
the practical importance of each of the parameters, especially taking care of 
implementation realities like available hardware. 

The choice of b. This main parameter, whose asymptotical optimum value is
of order n^{1/3} (log n)^{2/3}, controls the ratio between the work amounts in the first and second
stages. The bigger b, the easier the first stage (even if we have twice as many
relations to produce, the probabilities of smoothness increase drastically with
b). On the other hand, increasing b by 1 almost doubles the size of the linear
system in the second stage. Since the linear algebra is hardly distributable, the
available hardware enforces a strong limit on the size of this system (otherwise
the matrix would not fit into memory).

The choice of d_A and d_B. Originally, Coppersmith grouped them as a single
parameter chosen asymptotically "near b"*. These parameters account for the
number of pairs to test. Taking into account the probability of smoothness, we
have to make sure that the available coprime (A, B) pairs will be
enough to produce the required number of relations among the elements of B.
Of course, the sad news is that increasing d_A and d_B raises the degrees of C and
D, and hence lowers the probability of smoothness. We have split Coppersmith's
single parameter in two because it is usually possible to choose d_B a little bit
above dA without increasing the degrees of C and D (the optimum difference 
between the two is Therefore, we can maximize the number of pairs 

which are available. 



The choice of k. Ideally, C and D have almost the same degree, their optimal
value being of order n^{2/3} (log n)^{1/3}. In fact, these can be somewhat unbalanced from the
practical point of view. The parameter k is there to keep these polynomials in
the same range, but unfortunately the requirement that k be a power of 2 gives
us little control over it. The asymptotically best value for k is sqrt(n/d_A).
For the problems we are concerned about, k = 4 appeared to be the correct
choice. It might be that, at n = 607, we are nearing the cross-over point between
k = 4 and k = 8, but k = 8 is still inadequate. One other aspect of the
choice of k is that half of the coefficients in the linear system are −k (the other
ones being 1's). This brings a complication to the linear algebra (the structured
gaussian elimination, namely), which could only be worsened by the choice of a
bigger k.

* An asymptotic ratio is computed in [11], depending on the algorithm used for linear
algebra.

The choice of f_1. Another hidden parameter lies in the choice of f_1. Usually, one
can choose among a couple of candidates for f_1. The ones of low degree have a
clear advantage due to the influence on deg D, but [20] shows that polynomials
with small factors are also worth investigating. The reader is referred to [20] for
a thorough discussion on the choice of f_1.

In our computations, for n = 607, the following parameters were chosen:
b = 23 (hence #B = 766,150), d_A = 21, d_B = 28, k = 4, h = 152. As for the
choice of f_1, it turned out that X^9 + X^7 + X^6 + X^3 + X + 1 had an overwhelming
advantage, being simultaneously the candidate of smallest degree and with only
small factors: f_1 factorizes as (X + 1)^2 (X^2 + X + 1)^2 (X^3 + X + 1). Given these
parameters, the respective degrees of C and D were 173 and 112.
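The degrees quoted above can be checked with a few lines of code. The sketch below is illustrative only: it encodes polynomials over F_2 as Python integers (bit i is the coefficient of X^i), takes f = X^607 + f_1 with f_1 = X^9 + X^7 + X^6 + X^3 + X + 1 and k = 4, h = 152, and uses hypothetical helper names (pmul, pmod, ppowmod, make_pair). It builds C = X^h A + B and D = C^k mod f for one arbitrary pair of degrees (21, 28).

```python
N = 607
F1 = (1 << 9) | (1 << 7) | (1 << 6) | (1 << 3) | (1 << 1) | 1
F = (1 << N) | F1            # f = X^607 + f_1
K, H = 4, 152                # k and h = ceil(607 / 4)

def deg(p):
    """Degree of a polynomial over F_2 encoded as an int (deg(0) = -1)."""
    return p.bit_length() - 1

def pmul(a, b):
    """Carry-less product of two polynomials over F_2."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r

def pmod(a, m):
    """Remainder of a modulo m (m non-zero)."""
    dm = deg(m)
    while deg(a) >= dm:
        a ^= m << (deg(a) - dm)
    return a

def ppowmod(a, e, m):
    """a^e modulo m by square and multiply."""
    r, a = 1, pmod(a, m)
    while e:
        if e & 1:
            r = pmod(pmul(r, a), m)
        a = pmod(pmul(a, a), m)
        e >>= 1
    return r

def make_pair(A, B):
    """Return (C, D) with C = X^h A + B and D = C^k mod f."""
    C = (A << H) ^ B
    return C, ppowmod(C, K, F)
```

With any A of degree 21 and B of degree 28, deg C = 152 + 21 = 173, and D coincides with A^4 X f_1 + B^4, whose degree is max(84 + 1 + 9, 112) = 112.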

4 Description of the Polynomial Sieve 

In Coppersmith's original version of the algorithm, the smooth pairs were
located by repeatedly applying a smoothness test to all pairs of the allowed range.
Gordon and McCurley [20], as an alternative, designed an efficient polynomial
sieve, which helped to reduce the time spent on each pair (smooth or not). The
idea is as follows. For A fixed, we maintain a big array of integers (initially 0)
associated to the different pairs to be tested, that is, all the possible B's. Let g
be an irreducible polynomial. We want to add deg g to the values associated to
the B's* satisfying:

B ≡ A X^h [g]. (E)

Doing this sieve efficiently implies being able to step quickly through all multiples
of g. This can be done without awkward polynomial multiplications using Gray
codes. For any positive integer x, let l(x) denote the index of the least
significant bit set in the binary representation of x (starting at l(1) = 0, l(2) = 1).
Then the congruence class of A X^h mod g among the polynomials of degree less
than or equal to d_B is given by the set of values of the sequence defined by:
B_0 = A X^h mod g, B_i = B_{i−1} + X^{l(i)} g for 0 < i < 2^{d_B − deg g + 1}. Of course, it is
worthwhile to precompute the X^i g's, since these differ from each other only by
arithmetic shifts.
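The Gray-code walk through one congruence class can be sketched as follows. This is an illustration under assumptions, not the implementation used for the record: polynomials are Python integers (bit i is the coefficient of X^i), the sieve array is a plain list indexed by B, and the names lsb_index and sieve_one are hypothetical. The sketch assumes deg g ≤ d_B + 1.

```python
def deg(p):
    """Degree of a polynomial over F_2 encoded as an int (deg(0) = -1)."""
    return p.bit_length() - 1

def pmod(a, m):
    """Remainder of a modulo m (m non-zero)."""
    dm = deg(m)
    while deg(a) >= dm:
        a ^= m << (deg(a) - dm)
    return a

def lsb_index(x):
    """l(x): index of the least significant set bit of x > 0."""
    return (x & -x).bit_length() - 1

def sieve_one(table, start, g, d_b):
    """Add deg g to table[B] for all B of degree <= d_b with B = start mod g."""
    dg = deg(g)
    b = pmod(start, g)                      # B_0
    table[b] += dg
    for i in range(1, 1 << (d_b - dg + 1)):
        b ^= g << lsb_index(i)              # B_i = B_{i-1} + X^{l(i)} g
        table[b] += dg
```

Each step costs one shift and one exclusive-or, and every member of the class is visited exactly once because successive values differ by a single bit of the Gray code.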

This sieve is done for a certain collection of irreducible polynomials. One
can also take into account the contribution of powers of irreducible polynomials,
adding deg g to all B's satisfying B ≡ A X^h [g^j]. If the sieve is done for
all irreducible polynomials g, and also their powers, the value in each table cell
is precisely the degree of the smooth part in the factorization of the associated
quantity C = A X^h + B (an entry for which the congruence holds modulo
g^j accumulates a total contribution of j deg g from the consecutive sieves with
g, g^2, ..., g^j). Therefore, an entry in the table which has a value of deg C
automatically corresponds to a pair with C smooth.

* Or, equivalently, the pairs, since A remains fixed.

In real life, one does not use all the relevant irreducible polynomials for the 
sieve, and an important improvement comes from the use of incomplete sieves. 
Two parts of the sieve are actually very expensive: the sieve over small irre- 
ducibles on the one hand, because there are many cells to update for each small 
irreducible, and the sieve over big irreducibles on the other hand because the 
initialization cost is high (and the number of irreducibles of a given degree rises
with the degree). Therefore, we considered skipping these parts. Doing so, we
lose accuracy, because the smoothness of C is only evaluated from the
contribution of medium-size polynomials. Instead of deg C, we use as qualification bound
the average contribution from medium-size polynomials to a smooth C. If the
standard deviation of this quantity is high, it will be hard to recognize pairs 
yielding smooth C’s among the set of all pairs to be considered. Since the subse- 
quent distinction between useful and useless pairs is done on a per-pair basis (a 
factorization job, in fact) their number should not grow too much. We found in- 
terest in skipping the sieve over irreducibles of degree 1 to 9, because their total 
contribution to smooth polynomials did not deviate too much from its average 
value, whereas we only skipped high degree 23, because otherwise we would have 
had to lower drastically the qualification bound to catch sufficiently many of the 
pairs yielding a smooth C, which in turn would have made the factorization cost 
too high. 

Based on the same ideas, it is not always worthwhile to sieve over powers
of an irreducible polynomial. Locating cells corresponding to pairs divisible by
g^i for an irreducible polynomial g and an integer i is practically pointless if the
expected number of cells to update is too small (this number is 2^{d_B + 1 − i deg g}).
In fact, the only powers that we found worthwhile to sieve with were squares of
polynomials of degrees 10 and 11.

It could be tempting to try to also do a sieve with D, but the situation is quite
different. The initialization of the sieve must be done with B_0 = A (X^{hk−n} f_1)^{1/k}
mod g, for g an irreducible polynomial. This computation is more complicated
than previously. Also, this only works when g is an irreducible polynomial, and
not when it is a power of an irreducible, because a k-th root might not exist
modulo g^i. This difficulty is due to the same particularity of D that Gordon and
McCurley already noticed in [20]: this polynomial is more likely to be square-free
than it would be if it were random (and therefore it is less likely to be smooth).
As we have just seen, this last point is not too disturbing since one hardly uses
powers of irreducibles for the sieve.

Sieving over D turned out not to be useful in our case, since the first sieve
(over C) already eliminated most of the pairs, and eventually testing the
smoothness of D on a per-pair basis was more efficient. Nonetheless, sieving over D only
instead of C could be useful in different settings, depending on how deg C and
deg D compare to each other. In F_{2^607}, the parameter k seems to be better
around 4, and as a consequence, the degrees of C and D are not really balanced:
deg C is much higher. If we were to carry out computations in, say, F_{2^997},
k = 8 would probably be a better choice, and deg D would automatically become
bigger than deg C. A sieve over D in this situation would therefore enable
us to discard many more pairs than its counterpart (because there would be
very few smooth D's), and the benefit in the factorization part would probably
compensate for the sieve's relative drawback.

5 Using Large Primes 

One well-known improvement to the sieving part of index calculus algorithms is 
the so-called large prime variation. The idea is that aside plain, full relations, we 
allow partial relations, corresponding to pairs which are smooth up to a certain 
number of big irreducible cofactors (above the factor base bound) called large 
primes. Afterwards, these partial relations are matched together when this is 
possible in order to eliminate the cofactors. The partial relations come almost 
for free in the sieving stage, since they would otherwise have been discarded at 
the end of the factorization stage and not earlier. The degree of large primes must 
of course be kept under a certain bound: allowing for too large “large primes” 
eventually brings no benefit. From our point of view, this approach fits well here. 
We merely have to lower the qualification bound from deg C to deg C — L, where 
L is the maximum allowed degree of large primes. 

When we allow only one large prime, matching partial relations together 
involves only a hashing process in order to be able to spot partial relations 
containing an already met large prime. The number of full relations reconstructed 
this way grows quadratically vs. the number of partial relations. When up to two 
large primes are used (see [30]), an algorithm resembling “union- find” helps to 
find cycles: relation after relation, we build a graph whose vertices are the large 
primes. An edge connects two vertices if a partial relation exists involving them.
There is also a special vertex named “1”, to which all primes involved alone in a 
partial relation are connected. Under certain conditions*, a cycle in this graph 
will give us a free full relation. The overhead is small, but this cycle detection has 
to be implemented with care because managing a graph with more than 10® edges 
among 2.10® vertices can turn out to be quite awkward. More elaborate schemes 
allow the processing of partial relations with more large primes, see for instance 
[15]. Recently, in the course of the record-breaking factorization of RSA-155, 
S. Cavallar proposed in [8] an efficient scheme for this large prime matching task, 
inspired by structured gaussian elimination like in [37]. We lacked the required 
time to investigate the respective efficiency of all of these different strategies 
when applied to our case. This is a real concern here, because while the multi- 
large-prime schemes have proven to be very efficient in the factorization context, 
this is not completely clear for discrete logarithms. Factorization algorithms 
use relations that are defined up to squares, that is, with exponents defined 
over F_2. For discrete logarithms, exponents are defined in a big finite ring, here
Z/(2^n − 1)Z. When combining partial relations with large primes in common,
one can only cancel one large prime at a time. For this reason, the landscape is
quite different.

* Slight complications are brought by the fact that our coefficients are not defined
over F_2.

Our computations have been carried out using the double large prime varia- 
tion, that works well even with regard to the coefficient issue. Two large primes 
were allowed. For efficiency reasons (discussed in section 7), only the factoriza- 
tion of C could have two large primes, while D was restricted to only a single 
large prime. 10% of the relations had actually only one large prime, and among 
the remaining relations (that had two large primes), 30% had both large primes 
on the same side (the C side, actually), the rest of the relations having their 
large primes balanced on each side. We did the cycle detection using a straight- 
forward union-find algorithm. Figures about the cycle detection can be found in 
section 9. 
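The graph-based cycle detection described in this section can be sketched with a small union-find structure. This is only an illustration under assumptions: large primes are represented by arbitrary hashable labels, the special vertex is the integer 1, the class and function names are hypothetical, and the sketch merely counts the cycles. It does not rebuild the full relations and ignores the coefficient issue mentioned above.

```python
class UnionFind:
    """Disjoint sets over arbitrary hashable vertices."""

    def __init__(self):
        self.parent = {}

    def find(self, v):
        self.parent.setdefault(v, v)
        root = v
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[v] != root:       # path compression
            self.parent[v], v = root, self.parent[v]
        return root

    def union(self, u, v):
        """Merge the sets of u and v; return False if already connected."""
        ru, rv = self.find(u), self.find(v)
        if ru == rv:
            return False
        self.parent[ru] = rv
        return True

def count_cycles(partials):
    """Count edges closing a cycle; each partial is a tuple of 1 or 2 large primes."""
    uf = UnionFind()
    cycles = 0
    for primes in partials:
        u = primes[0]
        v = primes[1] if len(primes) == 2 else 1   # special vertex "1"
        if not uf.union(u, v):
            cycles += 1
    return cycles
```

Every edge that joins two vertices already in the same component closes one independent cycle, so the returned count is the number of candidate full relations obtainable from the partial ones.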

6 Grouping Sieves 

As described above, the sieve algorithm uses an array of fixed size, namely
2^{d_B+1} bytes (assuming one byte per sieve location). Our setup had d_B = 28, so
this makes a sieve area of 512MB, far above what is acceptable. Furthermore, it
was not certain at the beginning of the sieve whether the outcome of pairs with a
polynomial B of big degree would eventually be used or not. We decided to have
a first look at the pairs from which we knew that the outcome would be better,
that is, the pairs with smaller B's, and defer the analysis of less promising pairs
to a later time. Our strategy was to decompose the whole sieving job in chunks
indexed by fixed parts A_f and B_f of the polynomials A and B. The chunks
consisted of areas of the form:

chunk(A_f, B_f) = {(A, B) = (A_f X^{δ_A+1} + A_v, B_f X^{δ_B+1} + B_v),
                  deg A_v ≤ δ_A, deg B_v ≤ δ_B}, with δ_A = 6, δ_B = 24.

Each chunk could be sieved by the machine handling it in any suitable way. The
most straightforward approach is to do 2^7 = 128 sieves, each of them addressing
2^25 bytes, that is 32MB, for the sieve area.
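The chunk decomposition and the resulting memory figures can be written down explicitly. The following sketch is illustrative: polynomials are Python integers (bit i is the coefficient of X^i), one byte per sieve location is assumed as in the text, and the names split and chunk_of are hypothetical.

```python
D_B = 28                     # degree bound on B
DELTA_A, DELTA_B = 6, 24     # degrees of the variable parts A_v and B_v

def split(poly, delta):
    """Return (fixed, variable) with poly = fixed * X^(delta+1) + variable."""
    return poly >> (delta + 1), poly & ((1 << (delta + 1)) - 1)

def chunk_of(A, B):
    """Index (A_f, B_f) of the chunk containing the pair (A, B)."""
    return split(A, DELTA_A)[0], split(B, DELTA_B)[0]

full_sieve_bytes = 1 << (D_B + 1)        # one sieve over all B: 512MB
sieves_per_chunk = 1 << (DELTA_A + 1)    # one sieve per value of A_v: 128
bytes_per_sieve = 1 << (DELTA_B + 1)     # one byte per value of B_v: 32MB
```

Two pairs fall into the same chunk exactly when they agree on all coefficients of A above degree 6 and all coefficients of B above degree 24.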

Since we ran the job using idle time on many not-so-powerful machines, this
was still too much memory to be used for some of them. A further possibility
is to divide the 32MB sieve area into yet more (say 2^γ, with γ a small integer),
smaller sieve areas (of size 2^{−γ} × 32MB). But when the sieve area becomes
so small, the initialization cost becomes too important. The expensive task is
the modular reduction A X^h mod g, which is performed for each g. One can
precompute the initialization data for the 2^γ sieves, but even after doing that,
we were unsatisfied with the cost of the initialization, and tried to trim it down
even more.

We wanted to achieve this without letting additional bits of B vary, but
rather sieving over several A's at a time. This is possible because for reasonably
close A's, the initialization for a given g is almost the same. In the following
paragraphs, g will denote either an irreducible polynomial, or a power of an
irreducible polynomial, and d_g its degree. Inside a given sieve with A completely fixed, we want to
find the solutions to:

B_f X^{δ_B+1} + B_v ≡ A X^h [g].

If we allow some of the lowest bits of A, say ε of them (with ε ≤ δ_A + 1, of
course) to vary, the equation becomes:

B_v + a X^h ≡ A X^h + B_f X^{δ_B+1} [g], with deg a < ε. (E')

The solutions to this equation form an affine subspace S of the F_2-vector space
V = F ⊕ G, with F = ⟨1, X, X^2, ..., X^{δ_B}⟩ and G = ⟨X^h, ..., X^{h+ε−1}⟩. The
expected dimension of S is dim S = δ_B + 1 + ε − d_g. We will try to find S
using linear algebra over F_2. The idea behind this is that arithmetic shifts and
logical operations take almost no time compared to a polynomial division or
multiplication. We will consider two situations.

The easy case is when $d_g \le \delta_b + 1$. $S$ can be written as $s_0 + S'$,
with a point $s_0 = AX^h + B_f X^{\delta_b+1} \bmod g$, and an underlying
vector space $S'$ spanned by the $X^i g$ for $0 \le i \le \delta_b - d_g$, and
the $X^{h+i} + (X^{h+i} \bmod g)$ for $0 \le i < \varepsilon$. We claim that the
computation of these generators costs very little above two modular reductions
since once $X^h \bmod g$ has been computed, inferring the $X^{h+i} \bmod g$
inductively is easy (one bit test and one exclusive-or if needed). If we did
independent sieves we would have needed $2^\varepsilon$ modular reductions
(which can be anything but cheaper).
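
The inductive step mentioned above can be illustrated with a minimal sketch. It assumes polynomials over $\mathbb{F}_2$ are stored as Python integers (bit i holds the coefficient of $X^i$); the function names are hypothetical and are not taken from the implementation described in this paper.

```python
def poly_mod(a, g):
    """Remainder of a modulo g over F_2 (bit i of an int = coefficient of X^i)."""
    dg = g.bit_length() - 1
    while a.bit_length() - 1 >= dg:
        a ^= g << (a.bit_length() - 1 - dg)
    return a


def shifted_residues(h, eps, g):
    """Return [X^(h+i) mod g for 0 <= i < eps] with a single full reduction."""
    dg = g.bit_length() - 1
    r = poly_mod(1 << h, g)      # the only expensive step
    residues = []
    for _ in range(eps):
        residues.append(r)
        r <<= 1                  # multiply by X
        if (r >> dg) & 1:        # one bit test
            r ^= g               # one exclusive-or if needed
    return residues


def generators(h, eps, g):
    """The vectors X^(h+i) + (X^(h+i) mod g); each one is a multiple of g."""
    return [(1 << (h + i)) ^ r
            for i, r in enumerate(shifted_residues(h, eps, g))]
```

For instance, `shifted_residues(152, 3, 0b10011)` reduces $X^{152}$ once modulo $X^4 + X + 1$ and derives the next two residues by shifting.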

If $d_g > \delta_b + 1$, we extend $V$ to $\bar{V} = \bar{F} \oplus G$, with
$\bar{F} = F \oplus \langle X^{\delta_b+1}, \ldots, X^{d_g-1} \rangle$.

Let $\bar{S}$ be the set of solutions of E' in $\bar{V}$. A point $\bar{s}_0$
in $\bar{S}$ is obtained as in the previous case, and generators of the
underlying vector space $\bar{S}'$ are the $u + \phi(u)$ for $u \in G$, $\phi$
being the linear map from $G$ to $\bar{F}$ that reduces a polynomial mod g.
Using gaussian elimination, we can find a point $s_0 \in S$ deduced from
$\bar{s}_0$ and $\bar{S}'$ if such a point exists, and the generators of $S'$
(the vector space underlying $S$) are the $u + \phi(u)$ for
$u \in \phi^{-1}(F)$. This involves finding the kernel of a
$(\dim \bar{F} - \dim F) \times \varepsilon$ matrix, which is expected to be
quite easy (perhaps a dozen CPU cycles). Although the case where
$d_g > \delta_b + 1$ is unlikely to be met often in practice (we don't want to
sieve when $\deg g$ is too big), we will augment this quick description with
an example. Suppose we have the following setup:

g = {X^^ + + A® + + 1)2, 

So = A27 + A26 + a® + 1, 
$h = 152$, $\delta_b = 24$, $\varepsilon = 3$.

The first three columns of the following matrix are the
$\dim \bar{F} - \dim F$ most significant coefficients of the polynomials
$u + \phi(u)$ for $u = X^{h+i}$ and $0 \le i < \varepsilon$. The last one
contains the leading coefficients of $\bar{s}_0$:

$$T = \begin{pmatrix} 0 & 1 & 0 & 1 \\ 1 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 \end{pmatrix}$$

Doing a gaussian elimination on the columns of T, one easily obtains:

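$s_0 = \bar{s}_0 + (u_0 + \phi(u_0)) + (u_1 + \phi(u_1)) \in S$,

$S' = \langle (u_0 + \phi(u_0)) + (u_2 + \phi(u_2)) \rangle$, where $u_i = X^{h+i}$.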

Of course, this example is a bit particular in that the last row of T is zero,
therefore the dimension of $S'$ is one, instead of the expected value 0. Other
cases can occur: for instance, if $\bar{s}_0$ had had a non-zero coefficient in
$X^{25}$, then we would have had no solutions to the equation E' inside V.
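
The elimination on the columns of T can be reproduced with a short sketch. It is only an illustration under stated assumptions: each column is encoded as a 3-bit integer (top row of T in the most significant bit), and the helper `solve_f2` is a hypothetical name, not part of the siever described here.

```python
def solve_f2(columns, target):
    """Gaussian elimination over F_2 on bit-vector columns.

    Returns (combination, kernel): `combination` is a bitmask of columns whose
    XOR equals `target` (None if there is none), and `kernel` lists bitmasks of
    column subsets that XOR to zero.
    """
    pivots = {}          # leading bit -> (reduced vector, column bitmask)
    kernel = []
    for j, col in enumerate(columns):
        vec, comb = col, 1 << j
        while vec:
            top = vec.bit_length() - 1
            if top not in pivots:
                pivots[top] = (vec, comb)
                break
            pvec, pcomb = pivots[top]
            vec ^= pvec
            comb ^= pcomb
        else:
            kernel.append(comb)      # the column reduced to zero
    vec, comb = target, 0
    while vec:
        top = vec.bit_length() - 1
        if top not in pivots:
            return None, kernel      # target is outside the column space
        pvec, pcomb = pivots[top]
        vec ^= pvec
        comb ^= pcomb
    return comb, kernel


# Columns of T read top to bottom: (0,1,0), (1,0,0), (0,1,0); last column (1,1,0).
T_COLUMNS = [0b010, 0b100, 0b010]
T_TARGET = 0b110
```

With these inputs, `solve_f2(T_COLUMNS, T_TARGET)` selects the first two columns for the particular solution and reports one kernel vector combining the first and third columns, which matches a solution space of dimension one. Setting the lowest bit of the target (a non-zero coefficient in the last row) makes the system inconsistent.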

We have shown two ways to play with the memory available to the siever. These
can actually be mixed together. Using parameters $\gamma$ and $\varepsilon$
together, a chunk is divided into sieves, each of them using
$2^{\delta_b + 1 - \gamma + \varepsilon}$ bytes of memory.

The influence of the two parameters $\gamma$ and $\varepsilon$ is shown in
Figure 1. Timings are in seconds of runtime on 450MHz Pentium II's. The
percentages show the timing difference versus the standard sieve (which has
$\gamma = \varepsilon = 0$). Three figures are present in each table cell. The
figures are always normalized to reflect the time needed to sieve a
(fictitious) 128MB sieve area. The first one, on which the effect of both
$\gamma$ and $\varepsilon$ is the most striking, shows the time spent in
initializing the sieve (or, when $\gamma > 0$, in re-reading again and again
the precomputed initialization data; precomputation time in this case is also
included). The second figure is the time spent in the sieve itself, that is,
adding $\deg g$ to each table cell corresponding to a pair divisible by g, for
all possible g's. The effects of $\gamma$ and $\varepsilon$ on this sieving
time are hardly noticeable (the variations are likely to be due to operating
system overhead). The third figure is the total time spent including
allocation overhead and final pair detection (but not the factorization,
which comes afterwards, and is irrelevant here). On the right and the bottom,
the actual memory sizes used by the sieve area are given. Since our jobs have
been running in background on otherwise used machines, we preferred not to use
too much memory. Using the setting $\gamma = 4$, $\varepsilon = 3$ was a
satisfying compromise, with a mere 16MB sieving area.

7 Factorization of the Pairs 

Once good pairs have been located, the actual production of the relations (or
partial relations) requires the factorization of the pairs (C, D). Efficient
algorithms exist for polynomial factorization, but our actual problem here is not the
usual one. Instead of the factorization of one huge polynomial (of degree several 
thousands for instance), we have to deal with the factorization of a huge num- 
ber of relatively small polynomials (in our case, their degree is less than 200). 
Therefore, asymptotically better behaving algorithms might not be worthwhile. 
Furthermore, we are willing to give up as soon as we suspect the polynomial 
might not be smooth after all. In a few words, merely applying some classical 
distinct degree factorization algorithm can turn out to be a considerable waste 
of time. We built a factorization scheme based on several specific improvements 
that turned out to be worthwhile.

Fig. 1. Influence of $\gamma$ and $\varepsilon$ on the sieving time. [Table of
timings not recoverable: for each pair ($\gamma$, $\varepsilon$), the
initialization time, the sieving time and the total time, each with its
percentage difference versus the standard sieve.]



The pairs that constitute the input to the factorization step are such that C 
has a reasonable probability to be smooth: it has been selected for this purpose. 
D, however, has no reason to be smooth. Therefore, the first thing to try out is 
a smoothness test on D, in order to avoid useless computations on all pairs with 
non-smooth D's. The smoothness test applied is the same as in [11], except that
we want to allow large primes. The b-smooth part of D is computed as

$$D_{\text{smooth}} = \gcd\Bigl(D,\; D' \prod_{i=1+\lfloor b/2 \rfloor}^{b} \bigl(X^{2^i} + X\bigr) \bmod D\Bigr).$$

In some cases (if D has a very big square factor), $D_{\text{smooth}}$ might
not actually be b-smooth, but that's exceptional. Concerning the cofactor
$D / D_{\text{smooth}}$, we are facing a design choice, since we can either
allow only one large prime, that is, allow a cofactor of degree at most C, or
permit several large primes, setting for instance the cofactor bound to the
looser 2C. However, in the latter case, the cofactor need not factor neatly
into two large primes of degree less than C (actually, it is most likely not
to). The best choice depends on what we want to do with partial relations. In
our experiments, less than 1% of the D's passed the former test, while around
12% passed the latter (hence there were more pairs to be factorized
afterwards, resulting in a 25% increase of the factorization cost). Since we
used only two large primes in total, the yield was better when using the
first, more restrictive test. If we were to allow three large primes or more
(cf. [15]), the second, looser, test would probably be more adequate.
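
A minimal sketch of this smoothness test follows. It assumes polynomials over $\mathbb{F}_2$ are stored as Python integers (bit i is the coefficient of $X^i$), uses plain schoolbook arithmetic rather than anything tuned, and takes the product over $\lfloor b/2 \rfloor < i \le b$. All function names are hypothetical.

```python
def pdeg(a):
    return a.bit_length() - 1


def pmod(a, m):
    """Remainder of a modulo m over F_2."""
    dm = pdeg(m)
    while pdeg(a) >= dm:
        a ^= m << (pdeg(a) - dm)
    return a


def pmul(a, b):
    """Carry-less (F_2[X]) product of a and b."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r


def pgcd(a, b):
    while b:
        a, b = b, pmod(a, b)
    return a


def pderiv(a):
    """Formal derivative over F_2: only odd-degree terms survive."""
    even_positions = sum(1 << k for k in range(0, a.bit_length(), 2))
    return (a >> 1) & even_positions


def smooth_part(D, b):
    """gcd(D, D' * prod_{b//2 < i <= b} (X^(2^i) + X) mod D)."""
    prod = pmod(pderiv(D), D)
    xq = 0b10                           # X
    for i in range(1, b + 1):
        xq = pmod(pmul(xq, xq), D)      # X^(2^i) mod D
        if i > b // 2:
            prod = pmod(pmul(prod, xq ^ 0b10), D)
    return pgcd(D, prod)
```

The degree of the cofactor is then `pdeg(D) - pdeg(smooth_part(D, b))`, which can be compared with whichever cofactor bound has been chosen.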

Once D has passed the test, and has therefore an acceptable probability to
be smooth (save the cofactors), we need to factorize C and D. Originally, when
running our program on smaller examples like $\mathbb{F}_{2^{313}}$, we found
it useful to track down small factors either by explicit trial division or by
precomputations and table lookup. The idea is to quickly compute the valuation
of a given polynomial with respect to some irreducible. Of course, this is
trivial for the valuation with respect to X. Let us explain briefly how this
can be done for the valuation with respect to X + 1. We notice that
$(X + 1)^{16} = X^{16} + 1$. Since our implementation represents the
polynomials over $\mathbb{F}_2$ using one bit per coefficient, computing a
remainder modulo $X^{16} + 1$ is fast. Assuming we have a 32-bit machine, this
requires less than $\frac{\deg P}{32} + 3$ operations (exclusive "OR"s, one
shift and one "AND"). If we have a precomputed table holding the values of
$\nu(Q)$ ($\nu$ is the valuation) for all polynomials Q of degree 0 to 15
(this requires 32KB), we can obtain $\nu(P)$ with high probability. Indeed, we
have $\nu(P \bmod X^{16} + 1) = \nu(P)$ unless $P \equiv 0 \ [X^{16} + 1]$ (in
which case we have an inequality $\le$). Once we have this value, we merely
have to do one division by the appropriate (precomputed) power of X + 1. If
the valuation is at least 16, we repeat the operation on the cofactor. From
the basic observation that a remainder modulo a cyclotomic polynomial is
easily computable, we could extend this approach for irreducibles of degree
up to 4. Alas, the improvement obtained from this method was not significant
for the case of n = 607, probably because the average degree of C and D made
the contribution of little factors too small for this to be useful. We also
tried to factor the relations by sieving with all or part of the possible
irreducibles that could appear in the factorization, but this brought no
significant improvement either.
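
The valuation trick for X + 1 can be sketched as follows. This is an illustration only: polynomials are Python integers (bit i is the coefficient of $X^i$), the 32KB table is replaced by a cached function, and the single division by a precomputed power of X + 1 is replaced by repeated divisions by X + 1. All names are hypothetical.

```python
from functools import lru_cache


def fold16(p):
    """P mod (X^16 + 1): since X^16 = 1, XOR the 16-bit chunks of P together."""
    r = 0
    while p:
        r ^= p & 0xFFFF
        p >>= 16
    return r


def div_x_plus_1(q):
    """Return (quotient, remainder) of q divided by X + 1 over F_2."""
    quo = 0
    while q.bit_length() > 1:
        shift = q.bit_length() - 2
        quo |= 1 << shift
        q ^= 0b11 << shift
    return quo, q


@lru_cache(maxsize=None)
def val16(q):
    """Valuation at X + 1 of a polynomial of degree < 16; 16 stands for q = 0."""
    if q == 0:
        return 16
    v = 0
    while True:
        quo, rem = div_x_plus_1(q)
        if rem:
            return v
        q, v = quo, v + 1


def valuation_x_plus_1(p):
    """Multiplicity of X + 1 in a non-zero polynomial p, 16 units at a time."""
    if p == 0:
        raise ValueError("valuation of the zero polynomial is undefined")
    total = 0
    while True:
        v = val16(fold16(p))
        for _ in range(v):
            p = div_x_plus_1(p)[0]
        total += v
        if v < 16:
            return total
```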

Since it turned out that our attempts towards removing some of the factors 
by hand were not worthwhile, the whole factorization job was achieved by a 
general-purpose factorization algorithm (in any case, if we did remove some of 
the factors by trial division, the cofactor would have still had to be factorized 
via such an algorithm). We used Niederreiter’s algorithm [35], which proved 
four times faster than a classical distinct degree factorization procedure. The 
explanation of this lies of course in the small degree of our polynomials, and in 
the fact that we work over $\mathbb{F}_2$, for which Niederreiter's algorithm is well suited.

8 Improvements to the Linear Algebra Stage 

The sparse matrix emerging from the sieving has roughly as many columns as the
factor base has elements (766,150 polynomials, see Table 1), and a bigger
number of lines (we had a 40% excess). This matrix is extremely sparse: the
number of non-zero terms (called the weight) of a given line corresponding to
a smooth pair (C, D) is actually the number of distinct factors in the
factorization of $DC^{-k}$. Most relations are also obtained from
recombinations of partial relations, so the weight for a recombination of s
relations is s times the average number of factors in a factorization like
$DC^{-k}$. In our case, this amounts to an average weight of the lines for the
whole matrix of 67.7. Handling such systems requires well-suited algorithms,
designed to take advantage of the sparsity as much as possible. Actually, this
is a well studied subject, since sparse matrices arise in many domains. For
the literature about sparse matrices coming from discrete logarithm or
factorization problems, one can consult [37,43,12,34,25,28]. Two particularly
annoying points are relevant to our case. First, unlike linear systems that
arise from factorization problems, ours is defined over a big field,
$\mathbb{Z}/(2^{607} - 1)\mathbb{Z}$. Second, unlike what happens with
Adleman's algorithm [1], or with the number field sieve when applied to the
discrete logarithm problem [19], our coefficients are not always $\pm 1$. As
explained earlier in this article, half of them are $\pm k$.

In order to solve our system, we first apply the well-known structured gaussian 
elimination as described in [37]. This algorithm takes advantage of both the 
sparsity of the matrix, and also of the “unbalanced” shape of its lines: each 
line in the matrix corresponds to a relation, and the coefficients on the left 
correspond to small factors, while those on the right correspond to big factors. 
The probability of a given polynomial to be divisible by a given factor of
degree d being $2^{-d}$, the density of the matrix is much higher on the left
part (small factors) than on the right part (big factors). The structured
gaussian elimination starts from the right end of the matrix (which is
extremely sparse) and tries to remove lines and columns while increasing the
matrix density as little as possible (if at all).

We modified the original process described in [37] in the spirit of what is
done in [42]: we evaluate, at each step, the influence of each possible
operation on the cost of the linear system solving algorithm that follows the
structured gaussian elimination (SGE). The steps that best reduce the linear
algebra cost are taken, until nothing interesting can be done anymore. This
process is able to shrink down the matrix to a fraction of its original size.
Here, having many coefficients equal to $\pm k$ on input causes lines to be
multiplied quite often while pivoting is done. Since a given line cannot be
multiplied too many times (otherwise we would have to allow the coefficient to
grow above one machine word), this makes the elimination less efficient.

Afterwards, we found it enlightening to use the block Wiedemann algorithm.
This algorithm was proposed by Coppersmith in [12], extending a previous
algorithm by Wiedemann [43]. Another algorithm, the block Lanczos algorithm
[34], is often preferred to the block Wiedemann algorithm. We used the latter
because it gave us an opportunity to experiment successfully with the
acceleration procedure described in [39]: the crux of the block Wiedemann
algorithm is the computation of a linear generator for a matrix sequence (a
matrix analogue to the Berlekamp-Massey algorithm), and [39] uses FFT to
reduce the complexity of this task from $O(N^2)$ to $O(N \log^2 N)$, achieving
a 50 times speedup for the computation undertaken here. The block Wiedemann
algorithm performs well both theoretically and in practice. See
[40,41,24,25,39] for several insights on the algorithm. The block Wiedemann
algorithm is interesting in that, at least for one part of the algorithm,
several machines holding a private copy of the matrix (for which they need to
have the proper amount of memory) can each do a part of the work without
communication between them. Therefore, one can regard this as a partial
distribution. We found that the optimal number of machines to be used
simultaneously in this computation was 4 (luckily, we had that number of
machines able to hold the 400MB matrix in RAM).



9 Computations over $\mathbb{F}_{2^{607}}$

The comprehensive sieving part took about 19,000 MIPS years. As a comparison,
the factorization of RSA-155 required 8,000 MIPS years. The outcome of the
sieving processes, in terms of relations per hour, dropped from 1000 relations
(full or partial) per hour with the very first chunks (the degrees were still small)
to 400 afterwards, and eventually 100 for the very last ranges of data. Almost
all the sieving area up to ds = 28 has been needed (a more thorough usage of
this area could have been achieved if we did not use incomplete sieves, but the
trade off was clear in their favor). Of these relations, of course, most were partial
ones. The total amount of data produced by these sub-processes nears 10GB.
The cycle detection algorithm ran approximately for one day and produced the
biggest part of the relations at the end: 815,726 relations were reconstructed
using cycles of length going from 2 to 40. All of these cycles were linked to the
special vertex "1", which is not surprising given the size of the corresponding
connected component. More than 650,000 relations were obtained from cycles of
length 3 or more, which shows that using the double large prime variation was
a winning choice. Meanwhile, we only produced 217,867 genuine full relations.
Additional data can be found in Table 1. The average weight of these 1,033,593
relations in total was 67.7, the maximum weight being 524. We discarded the
relations whose weight was above 120, since these were definitely too heavy to
be useful. We were left with 904,004 relations, involving 765,427 columns (the
average weight dropped to 64.3).

We ran a structured gaussian elimination algorithm (SGE) on this matrix. 
The schedule time for SGE was approximately one day. We were able to divide 
by two the cost of the subsequent block Wiedemann algorithm. The matrix 
obtained after the SGE had size 484,603 × 484,603 with an average line weight
of 106.7. One can find this reduction ratio quite disappointing compared to 
ratios typically achieved in other contexts. This could be a consequence of the 
fact that most relations were recombined ones. These were therefore denser, and 
had coefficients somewhat bigger than other lines, which impairs the reduction 
ratio of the SGE. 

The block Wiedemann algorithm is currently underway, in the process of
finding an element of the kernel of this matrix. We expect it to be finished by
the beginning of the autumn. It should be noted that since $2^{607} - 1$ is
prime, the linear algebra task cannot be eased in any way by the Chinese
remainder theorem.

Table 1. Data from computations in $\mathbb{F}_{2^{607}}$

Size of the factor base                               766,150 polynomials
Total number of relations                           1,033,593 relations
Full relations                                        217,867 relations
Cycles obtained                                       815,726 cycles
Partial relations used (≤ 2 large primes)          60,128,419 relations
Large primes involved                              85,944,405 polynomials
Relations with only one large prime                 5,992,928 relations
Cycles of length 2                                    150,566 cycles
Cycles of length 3                                    142,031 cycles
Cycles of length 4                                    123,900 cycles
Cycles of length 5                                    101,865 cycles
Cycles of length 6 or more                            297,364 cycles
Size of the biggest cycle                                  40 edges
Size of the biggest connected component            22,483,158 edges
Size of the second biggest connected component            167 edges
Number of connected components with 1 edge         22,025,908 components
Number of connected components with 2 edges         2,726,940 components
Number of connected components with 3 edges           848,691 components


10 Conclusion 

Computation of discrete logarithms in $\mathbb{F}_{2^{607}}$ is now a matter
of weeks (linear algebra is in its last phase). As was predicted by Gordon and
McCurley in the conclusion of their article [20], this was far from an easy
task, and the computation took enormous proportions. Today's supercomputers
might achieve the work we did in quite a reasonable time, but going further
will necessarily imply more advanced techniques, including, but probably not
limited to, the use of four large primes (taking into account the remark on
the coefficient issue in section 5). The conclusion of our computation is that
one cannot seriously claim that discrete logarithms in, say,
$\mathbb{F}_{2^{997}}$, are within the reach of a computation of the type we
have undertaken. A very well-funded institution (e.g. governmental) could
perhaps go that far, but this is very likely to involve a tremendous (and
highly expensive) computational effort. An implication of our work for how we
should regard the security of an elliptic curve cryptosystem with a MOV
reduction [32] of the discrete logarithm problem to the discrete logarithm
problem in a field $\mathbb{F}_{2^n}$ is that if n is around 1,000, attacking
such a problem is very hard, and if n is around 1,200, this size is twice
above the computational mark that we have just set. Therefore, the security of
such a cryptosystem in the latter case can be seen as no lower than the
security of an RSA-1024 cryptosystem, given that RSA-512 schemes have been
successfully attacked using computational means comparable to ours.

Abstract. This paper describes several speedups and simplifications for
XTR. The most important results are new XTR double and single ex- 
ponentiation methods where the latter requires a cheap precomputation. 
Both methods are on average more than 60% faster than the old methods, 
thus more than doubling the speed of the already fast XTR signature 
applications. An additional advantage of the new double exponentiation 
method is that it no longer requires matrices, thereby making XTR easier 
to implement. Another XTR single exponentiation method is presented 
that does not require precomputation and that is on average more than 
35% faster than the old method. Existing applications of similar methods 
to LUC and elliptic curve cryptosystems are reviewed. 

Keywords: XTR, addition chains, Fibonacci sequences, binary Eu- 
clidean algorithm, LUC, ECC. 



1 Introduction 

The XTR public key system was introduced at Crypto 2000 [10]. From a security 
point of view XTR is a traditional subgroup discrete logarithm system, as was 
proved in [10]. It uses a non-standard way to represent and compute subgroup 
elements to achieve substantial computational and communication advantages 
over traditional representations. XTR of security equivalent to 1024-bit RSA 
achieves speed comparable to cryptosystems based on random elliptic curves 
over random prime fields (ECC) of equivalent security. The corresponding XTR 
public keys are only about twice as large as ECC keys, assuming global system 
parameters - without the last requirement the sizes of XTR and ECC public 
keys are about the same. Furthermore, parameter initialization from scratch for 
XTR takes a negligible amount of computing time, unlike RSA and ECC. 

This paper describes several important speedups for XTR, while at the same 
time simplifying its implementation. In the first place the field arithmetic as 
described in [10] is improved by combining the modular reduction steps. More 
importantly, a new application of a method from [15] is presented that results in
an XTR exponentiation iteration that can be used for three different purposes.
In the first place these improvements result in an XTR double exponentiation 
method that is on average more than 60% faster than the double exponentiation 
from [10]. Such exponentiations are used in XTR ElGamal-like signature verifica- 
tions. Furthermore, they result in two new XTR single exponentiation methods, 
one that is on average about 60% faster than the method from [10] but that 
requires a one-time precomputation, and a generic one without precomputation 
that is on average 35% faster than the old method. 

Examples where precomputation can typically be used are the ‘first’ of the 
two exponentiations (per party) in XTR Diffie-Hellman key agreement, XTR 
ElGamal-like signature generation, and, to a lesser extent, XTR-ElGamal en- 
cryption. The new generic XTR single exponentiation can be used in the ‘sec- 
ond’ XTR Diffie-Hellman exponentiation and in XTR-ElGamal decryption. As 
a result the runtime of XTR signature applications is more than halved, the 
time required for XTR Diffie-Hellman is almost halved, and XTR-ElGamal en- 
cryption and decryption can both be expected to run at least 35% faster (with 
encryption running more than 60% faster after precomputation). 

The method from [15] was developed to compute Lucas sequences. It can
thus immediately be applied to the LUC cryptosystem [18]. It was shown [16]
that it can also be applied to ECC. The resulting methods compare favorably
to methods that have been reported in the literature [5]. Because they are not
generally known their runtimes are reviewed at the end of this paper.

The double exponentiation method from [10] uses matrices. The new method 
does away with the matrices, thereby removing the esthetically least pleasing as- 
pect of XTR. For completeness, another double exponentiation method is shown 
that does not require matrices. It is directly based on the iteration from [10] and 
does not achieve a noticeable speedup over the double exponentiation from [10], 
since the matrix steps that are no longer needed, though cumbersome, are cheap. 

This paper is organized as follows. Section 2 reviews the results from [10] 
needed for this paper. It includes a description of the faster field arithmetic 
and matrix-less XTR double exponentiation based on the iteration from [10]. 
The 60% faster (and also matrix-less) XTR double exponentiation is presented 
in Section 3. Applications of the method from Section 3 to XTR single expo- 
nentiation with precomputation and to generic XTR single exponentiation are 
described in Sections 4 and 5, respectively. In Section 6 the runtime claims are 
substantiated by direct comparison with the timings from [10]. Section 7 reviews
the related LUC and ECC results.

2 XTR Background 

For background and proofs of the statements in this section, see [10]. Let p and
q be primes with $p \equiv 2 \bmod 3$ and q dividing $p^2 - p + 1$, and let g
be a generator of the order q subgroup of $\mathbb{F}_{p^6}^*$. For
$h \in \mathbb{F}_{p^6}^*$ its trace $Tr(h)$ over $\mathbb{F}_{p^2}$ is defined
as the sum of the conjugates over $\mathbb{F}_{p^2}$ of h:

$Tr(h) = h + h^{p^2} + h^{p^4} \in \mathbb{F}_{p^2}$.

Because the order of h divides $p^6 - 1$ the trace over $\mathbb{F}_{p^2}$ of h
equals the trace of the conjugates over $\mathbb{F}_{p^2}$ of h:

(1) $Tr(h) = Tr(h^{p^2}) = Tr(h^{p^4})$.

If $h \in \langle g \rangle$ then its order divides $p^2 - p + 1$, so that

$Tr(h) = h + h^{p-1} + h^{-p}$

since $p^2 \equiv p - 1 \bmod (p^2 - p + 1)$ and
$p^4 \equiv -p \bmod (p^2 - p + 1)$. In XTR elements of $\langle g \rangle$ are
represented by their trace over $\mathbb{F}_{p^2}$. It follows from (1) that
XTR makes no distinction between an element of $\langle g \rangle$ and its
conjugates over $\mathbb{F}_{p^2}$.

The discrete logarithm (DL) problem in $\langle g \rangle$ is to compute for a
given $h \in \langle g \rangle$ the unique $y \in \{0, 1, \ldots, q - 1\}$ such
that $g^y = h$. The XTR-DL problem is to compute for a given $Tr(h)$ with
$h \in \langle g \rangle$ an integer $y \in \{0, 1, \ldots, q - 1\}$ such that
$Tr(g^y) = Tr(h)$. If y solves an XTR-DL problem then $(p - 1)y$ and $-py$
(both taken modulo q) are solutions too. It is proved in [10, Theorem 5.2.1]
that the XTR-DL problem is equivalent to the DL problem in $\langle g \rangle$,
with similar equivalences with respect to the Diffie-Hellman and Decision
Diffie-Hellman problems. Furthermore, it is argued in [10] that if q is
sufficiently large (which will be the case), then the DL problem in
$\langle g \rangle$ is as hard as it is in $\mathbb{F}_{p^6}^*$. This argument
is the most commonly misunderstood aspect of XTR and is therefore rephrased
here.

Because of the Pohlig-Hellman algorithm [17] and the fact that
$p^6 - 1 = (p - 1)(p + 1)(p^2 + p + 1)(p^2 - p + 1)$, the general DL problem in
$\mathbb{F}_{p^6}^*$ reduces to the DL problems in the following four subgroups
of $\mathbb{F}_{p^6}^*$:

— The subgroup of order $p - 1$, which can efficiently be embedded in
$\mathbb{F}_p$.

— The subgroup of order $p + 1$ dividing $p^2 - 1$, which can efficiently be
embedded in $\mathbb{F}_{p^2}$ but not in $\mathbb{F}_p$.

— The subgroup of order $p^2 + p + 1$ dividing $p^3 - 1$, which can efficiently
be embedded in $\mathbb{F}_{p^3}$ but not in $\mathbb{F}_p$.

— The subgroup of order $p^2 - p + 1$, which cannot be embedded in any true
subfield of $\mathbb{F}_{p^6}$.
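
The factorization of $p^6 - 1$ used here, together with the two congruences modulo $p^2 - p + 1$ used earlier in this section, can be checked numerically with a few lines. This is only a sanity check with a hypothetical helper name; the identities hold for every integer p, and the small primes congruent to 2 mod 3 in the usage line are purely illustrative.

```python
def check_xtr_identities(p):
    """Check p^6 - 1 = (p-1)(p+1)(p^2+p+1)(p^2-p+1) and the reductions
    p^2 = p - 1 and p^4 = -p modulo p^2 - p + 1."""
    m = p * p - p + 1
    factorization_ok = p**6 - 1 == (p - 1) * (p + 1) * (p * p + p + 1) * m
    square_ok = (p * p) % m == (p - 1) % m
    fourth_ok = (p**4) % m == (-p) % m
    return factorization_ok and square_ok and fourth_ok


# Illustrative toy primes with p = 2 mod 3 (far too small for cryptographic use).
results = [check_xtr_identities(p) for p in (5, 11, 17, 23, 101)]
```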

So, to solve the DL problem in $\mathbb{F}_{p^6}^*$ in the most general case,
four DL problems must be solved. Three of these DL problems can efficiently be
reformulated as DL problems in multiplicative groups of the true subfields
$\mathbb{F}_p$, $\mathbb{F}_{p^2}$, and $\mathbb{F}_{p^3}$ of
$\mathbb{F}_{p^6}$. With the current state of the art of the DL problem in
extension fields, these latter three problems are believed to be strictly (and
substantially) easier than the DL problem in $\mathbb{F}_{p^6}^*$. But that
means that the subgroup of order $p^2 - p + 1$ is, so to speak, the subgroup
that is responsible for the difficulty of the DL problem in
$\mathbb{F}_{p^6}^*$. With a proper choice of q dividing $p^2 - p + 1$, this
subgroup DL problem is equivalent to the problem in $\langle g \rangle$. This
implies that the DL problem in $\langle g \rangle$ is as hard as it is in
$\mathbb{F}_{p^6}^*$, unless the latter problem is not as hard as it is
currently believed to be. It also follows that, if the DL problem in
$\langle g \rangle$ is easier than it is in $\mathbb{F}_{p^6}^*$, then the
problem in $\mathbb{F}_{p^6}^*$ can be at most as hard as it is in
$\mathbb{F}_p^*$, $\mathbb{F}_{p^2}^*$, or $\mathbb{F}_{p^3}^*$. Proving such a
result would require a major breakthrough.

Thus, for cryptographic purposes and given the current state of knowledge
regarding the DL problem in extension fields, XTR and $\mathbb{F}_{p^6}$ give
the same security. For p and q of about 170 bits the security is at least
equivalent to 1024-bit RSA and approximately equivalent to 170-bit ECC.

XTR has two main advantages compared to ordinary representation of elements
of $\langle g \rangle$:

— It is shorter, since $Tr(h) \in \mathbb{F}_{p^2}$, whereas representing an
element of $\langle g \rangle$ requires in general an element of
$\mathbb{F}_{p^6}$, i.e., three times more bits;

— It allows faster arithmetic, because given $Tr(g)$ and u the value
$Tr(g^u)$ can be computed substantially faster than $g^u$ can be computed
given g and u.

In this paper it is shown that $Tr(g^u)$ can be computed even faster than shown
in [10].

Throughout this paper, $c_u$ denotes $Tr(g^u) \in \mathbb{F}_{p^2}$, for some
fixed p and g of order q as above. Note that $c_0 = 3$. In [10,11,12] it is
shown how p, q, and $c_1$ can be found quickly. In particular there is no need
to find an explicit representation of $g \in \mathbb{F}_{p^6}$.

2.1 Improved $\mathbb{F}_{p^2}$ Arithmetic. Because $p \equiv 2 \bmod 3$, the
zeros $\alpha$ and $\alpha^p$ of the polynomial
$(X^3 - 1)/(X - 1) = X^2 + X + 1$ form an optimal normal basis for
$\mathbb{F}_{p^2}$ over $\mathbb{F}_p$. An element $x \in \mathbb{F}_{p^2}$ is
represented as $x_1\alpha + x_2\alpha^2$ with $x_1, x_2 \in \mathbb{F}_p$.
From $\alpha^p = \alpha^2$ it follows that $x^p = x_2\alpha + x_1\alpha^2$, so
that p-th powering in $\mathbb{F}_{p^2}$ is free. In [10] the product
$(x_1\alpha + x_2\alpha^2)(y_1\alpha + y_2\alpha^2)$ is computed by computing
$x_1y_1$, $x_2y_2$, $(x_1 + x_2)(y_1 + y_2) \in \mathbb{F}_p$, so that
$x_1y_2 + x_2y_1 \in \mathbb{F}_p$ and the product

$(x_2y_2 - x_1y_2 - x_2y_1)\alpha + (x_1y_1 - x_1y_2 - x_2y_1)\alpha^2 \in \mathbb{F}_{p^2}$

follow using four subtractions. This implies that products in
$\mathbb{F}_{p^2}$ can be computed at the cost of three multiplications in
$\mathbb{F}_p$ (as usual, the small number of additions and subtractions is
not counted).
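
The three-multiplication product and the free p-th powering described above can be sketched as follows. The sketch assumes an element $x_1\alpha + x_2\alpha^2$ is stored as the pair `(x1, x2)` of integers modulo p; the function names are hypothetical, and each product is reduced immediately, so the combined reductions discussed next are not modelled.

```python
def fp2_mul(x, y, p):
    """(x1*a + x2*a^2)(y1*a + y2*a^2) with three multiplications in F_p."""
    x1, x2 = x
    y1, y2 = y
    a = x1 * y1 % p
    b = x2 * y2 % p
    c = (x1 + x2) * (y1 + y2) % p
    cross = (c - a - b) % p              # x1*y2 + x2*y1
    return ((b - cross) % p, (a - cross) % p)


def fp2_frobenius(x):
    """p-th power: swapping the two coordinates costs nothing."""
    return (x[1], x[0])
```

For example, with p = 11, `fp2_mul((1, 0), (1, 0), 11)` gives `(0, 1)`, that is $\alpha \cdot \alpha = \alpha^2$.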

For a regular multiplication of $u, v \in \mathbb{F}_p$ the field elements u
and v are mapped to integers $u, v \in \{0, 1, \ldots, p - 1\}$, the integer
product $w = uv \in \mathbb{Z}$ is computed (the 'multiplication step'), the
remainder $w \bmod p \in \{0, 1, \ldots, p - 1\}$ is computed (the 'reduction
step'), and finally the resulting integer $w \bmod p$ is mapped to
$\mathbb{F}_p$. The reduction step is somewhat costlier than the
multiplication step; the mappings between $\mathbb{F}_p$ and $\mathbb{Z}$ are
negligible. The same applies if Montgomery arithmetic [13] is used, but then
the reduction and multiplication step are about equally costly.

It follows that the computation of
$(x_1\alpha + x_2\alpha^2)(y_1\alpha + y_2\alpha^2)$ can be made faster by
computing, in the above notation,
$w_1 = x_2y_2 - x_1y_2 - x_2y_1 \in \mathbb{Z}$ and
$w_2 = x_1y_1 - x_1y_2 - x_2y_1 \in \mathbb{Z}$ using four integer
multiplications, followed by two reductions $w_1 \bmod p$ and $w_2 \bmod p$.
This works both for regular and Montgomery arithmetic. Because the
intermediate results are at most $3p^2$ in absolute value the resulting final
reductions are of the same cost as the original reductions (with additional
subtraction correction in Montgomery arithmetic, at negligible extra cost). As
a result, products in $\mathbb{F}_{p^2}$ can be computed at the cost of just
two and a half multiplications in $\mathbb{F}_p$, namely the usual three
multiplication steps and just two reduction steps. If regular arithmetic is
used the speedup can be expected to be somewhat larger. It follows in a
similar way that the computation of $xz - yz^p \in \mathbb{F}_{p^2}$ for
$x, y, z \in \mathbb{F}_{p^2}$ can be reduced from four multiplications in
$\mathbb{F}_p$ to the same cost as three multiplications in $\mathbb{F}_p$;
refer to [10, Section 2.1] for the details of that computation. Combining, or
postponing, the reduction steps in this way is not at all new. See for
instance [4] for a much earlier application.

This results in the following improved version of [10, Lemma 2.1.1].

Lemma 2.2 Let $x, y, z \in \mathbb{F}_{p^2}$ with $p \equiv 2 \bmod 3$.

i. Computing $x^p$ is free.

ii. Computing $x^2$ takes two multiplications in $\mathbb{F}_p$.

iii. Computing $xy$ costs the same as two and a half multiplications in $\mathbb{F}_p$.

iv. Computing $xz - yz^p$ costs the same as three multiplications in $\mathbb{F}_p$.

Efficient computation of c„ given p, q, and ci is based on the following facts. 

2.3 Facts. Fact 2b follows from Lemma 2.2 and Facts lb and 2a. The other 
facts are derived as in [10]. 

1. Identities involving traces of powers, with $u, v \in \mathbb{Z}$:

a) $c_{-u} = c_{up} = c_u^p$. It follows from Lemma 2.2.i that negations and p-th
powers can be computed for free.

b) $c_{u+v} = c_u c_v - c_v^p c_{u-v} + c_{u-2v}$. It follows from Lemma 2.2.i and iv that
$c_{u+v}$ can be computed at the cost of three multiplications in $\mathbb{F}_p$ if $c_u$,
$c_v$, $c_{u-v}$, and $c_{u-2v}$ are given.

c) If $c_u = \tilde{c}_1$, then $\tilde{c}_v$ denotes the trace of the $v$-th power $g^{uv}$
of $g^u$, so that $\tilde{c}_v = c_{uv}$.

2. Computing traces of powers, with $u \in \mathbb{Z}$:

a) $c_{2u} = c_u^2 - 2c_u^p$ takes two multiplications in $\mathbb{F}_p$.

b) $c_{3u} = c_u^3 - 3c_u^{p+1} + 3$ costs four and a half multiplications in $\mathbb{F}_p$, and
produces $c_{2u}$ as a side-result.

c) $c_{u+2} = c_1 c_{u+1} - c_1^p c_u + c_{u-1}$ costs three multiplications in $\mathbb{F}_p$.

d) $c_{2u-1} = c_{u-1} c_u - c_1^p c_u^p + c_{u+1}^p$ costs three multiplications in $\mathbb{F}_p$.

e) $c_{2u+1} = c_{u+1} c_u - c_1 c_u^p + c_{u-1}^p$ costs three multiplications in $\mathbb{F}_p$.
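The identities of 2.3 can be checked numerically. The sketch below assumes the pair representation $x_1\alpha + x_2\alpha^2$ of $\mathbb{F}_{p^2}$ with $\alpha^2 + \alpha + 1 = 0$, a toy prime, and an arbitrary starting value $c_1$; the function names are illustrative. The reference values come from the recurrence of Fact 2c.

```python
P = 11  # toy prime with P % 3 == 2, for illustration only

def conj(x):                       # x^p, Lemma 2.2.i
    return (x[1], x[0])

def add(x, y):
    return ((x[0] + y[0]) % P, (x[1] + y[1]) % P)

def sub(x, y):
    return ((x[0] - y[0]) % P, (x[1] - y[1]) % P)

def mul(x, y):
    cross = x[0] * y[1] + x[1] * y[0]
    return ((x[1] * y[1] - cross) % P, (x[0] * y[0] - cross) % P)

def const(k):                      # integer k equals -k*alpha - k*alpha^2
    return ((-k) % P, (-k) % P)

def fact_1b(cu, cv, cu_v, cu_2v):  # c_{u+v}
    return add(sub(mul(cu, cv), mul(conj(cv), cu_v)), cu_2v)

def fact_2a(cu):                   # c_{2u} = c_u^2 - 2 c_u^p
    return sub(mul(cu, cu), add(conj(cu), conj(cu)))

def fact_2b(cu):                   # c_{3u} = c_u^3 - 3 c_u^{p+1} + 3
    cube = mul(mul(cu, cu), cu)
    return add(sub(cube, mul(const(3), mul(cu, conj(cu)))), const(3))

def trace_table(c1, n):
    """c_0, ..., c_n via Fact 2c, starting from c_0 = 3, c_1, c_2."""
    c = [const(3), c1, fact_2a(c1)]
    while len(c) <= n:
        c.append(add(sub(mul(c1, c[-1]), mul(conj(c1), c[-2])), c[-3]))
    return c
```

Negative indices are covered by Fact 1a, that is, $c_{-n}$ is obtained as `conj(c[n])`.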

Let $S_u$ denote the triple $(c_{u-1}, c_u, c_{u+1})$; thus $S_1 = (3, c_1, c_1^2 - 2c_1^p)$. The triple
$S_{2u-1} = (c_{2(u-1)}, c_{2u-1}, c_{2u})$ can be computed from $S_u$ and $c_1$ by applying
Fact 2a twice to compute $c_{2(u-1)}$ and $c_{2u}$ based on $c_{u-1}$ and $c_u$, respectively, and
by applying Fact 2d to compute $c_{2u-1}$ based on $S_u = (c_{u-1}, c_u, c_{u+1})$ and $c_1$.
This takes seven multiplications in $\mathbb{F}_p$. The triple $S_{2u+1}$ can be computed in a
similar fashion from $S_u$ and $c_1$ at the cost of seven multiplications in $\mathbb{F}_p$ (using
Fact 2e to compute $c_{2u+1}$).

Let $v$ be a non-negative integer, and let $v = \sum_{i=0}^{r-1} v_i 2^i$ be the binary
representation of $v$, where $v_i \in \{0, 1\}$, $r > 0$, and $v_{r-1} = 1$. It is well known that
the $v$-th power of an element of, say, a finite field can be computed using the
ordinary square and multiply method based on the binary representation of $v$.
A similar iteration can be used to compute $S_{2v+1}$, given $S_1$.

2.4 XTR Single Exponentiation (cf. [10, Algorithm 2.3.7]). Let $S_1$, $c_1$,
and $v_{r-1}, v_{r-2}, \ldots, v_0 \in \{0, 1\}$ be given, let $y = 1$ and $e = 0$ (so that $2e + 1 = y$;
the values $y$ and $e$ are included for expository purposes only). To compute $S_{2v+1}$
with $v = \sum_{i=0}^{r-1} v_i 2^i$, do the following for $i = r-1, r-2, \ldots, 0$ in succession:

Bit off If $v_i = 0$, then compute $S_{2y-1}$ based on $S_y$ and $c_1$, replace $S_y$ by $S_{2y-1}$
(and thus $S_{2e+1}$ by $S_{2(2e)+1}$ because it follows from $2e + 1 = y$ that
$2(2e) + 1 = 4e + 1 = 2y - 1$), replace $y$ by $2y - 1$, and $e$ by $2e$ (so that the
invariant $2e + 1 = y$ is maintained).

Bit on Else if $v_i = 1$, then compute $S_{2y+1}$ based on $S_y$ and $c_1$, replace $S_y$ by
$S_{2y+1}$ (and thus $S_{2e+1}$ by $S_{2(2e+1)+1}$ because it follows from $2e + 1 = y$ that
$2(2e + 1) + 1 = 4e + 3 = 2y + 1$), replace $y$ by $2y + 1$, and $e$ by $2e + 1$ (so
that the invariant $2e + 1 = y$ is maintained).

As a result $e = v$. Because $2e + 1 = y$ the final $S_y$ equals $S_{2v+1}$. Note that $v_{r-1}$,
or any other $v_i$, does not have to be non-zero.
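The iteration of Algorithm 2.4 can be sketched as follows. The sketch assumes the pair representation $x_1\alpha + x_2\alpha^2$ of $\mathbb{F}_{p^2}$ with $\alpha^2 + \alpha + 1 = 0$ and a toy prime; the function names are illustrative, and the operation counts of Lemma 2.2 are not modelled.

```python
P = 11  # toy prime with P % 3 == 2, for illustration only

def conj(x):
    return (x[1], x[0])

def add(x, y):
    return ((x[0] + y[0]) % P, (x[1] + y[1]) % P)

def sub(x, y):
    return ((x[0] - y[0]) % P, (x[1] - y[1]) % P)

def mul(x, y):
    cross = x[0] * y[1] + x[1] * y[0]
    return ((x[1] * y[1] - cross) % P, (x[0] * y[0] - cross) % P)

def const(k):
    return ((-k) % P, (-k) % P)

def double(cu):                       # Fact 2a
    return sub(mul(cu, cu), add(conj(cu), conj(cu)))

def step_off(S, c1):
    """S_y -> S_{2y-1}, using Fact 2a twice and Fact 2d once."""
    lo, mid, hi = S
    new_mid = add(sub(mul(lo, mid), mul(conj(c1), conj(mid))), conj(hi))
    return (double(lo), new_mid, double(mid))

def step_on(S, c1):
    """S_y -> S_{2y+1}, using Fact 2a twice and Fact 2e once."""
    lo, mid, hi = S
    new_mid = add(sub(mul(hi, mid), mul(c1, conj(mid))), conj(lo))
    return (double(mid), new_mid, double(hi))

def xtr_single_exp(c1, v, r):
    """Return S_{2v+1} from the r bits v_{r-1}, ..., v_0 of v."""
    S = (const(3), c1, double(c1))    # S_1
    for i in reversed(range(r)):
        S = step_on(S, c1) if (v >> i) & 1 else step_off(S, c1)
    return S
```

Leading zero bits are harmless here: a 'bit off' step applied to $S_1$ returns $S_1$ again.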

Both the ‘bit off’ and the ‘bit on’ step of Algorithm 2.4 take seven
multiplications in $\mathbb{F}_p$. Thus, given an odd positive integer $t < q$ and $S_1$, the triple
$S_t = (c_{t-1}, c_t, c_{t+1})$ can be computed in $7\log_2 t$ multiplications in $\mathbb{F}_p$. In [10]
this was $8\log_2 t$ because of the slower field arithmetic used there. The restriction
that $t$ is odd and positive is easily removed: if $t$ is even, then first compute $S_{t-1}$
and next apply Fact 2c, and if $t$ is negative, then use Fact 1a.

In Algorithm 2.4, the trace $c_1$ of $g$ in $S_1 = (c_0, c_1, c_2) = (3, c_1, c_1^2 - 2c_1^p)$ can
be replaced by the trace $c_t$ of the $t$-th power $g^t$ of $g$ (cf. Fact 1c): with $\tilde{c}_1 = c_t$,
$\tilde{S}_1 = (\tilde{c}_0, \tilde{c}_1, \tilde{c}_2) = (3, c_t, c_{2t}) = (3, c_t, c_t^2 - 2c_t^p)$, and the previous paragraph, the
triple $\tilde{S}_v = (\tilde{c}_{v-1}, \tilde{c}_v, \tilde{c}_{v+1}) = (c_{(v-1)t}, c_{vt}, c_{(v+1)t})$ can be computed in $7\log_2 v$
multiplications in $\mathbb{F}_p$, for any positive integer $v < q$.

Now let $v = \sum_{i=0}^{r-1} v_i 2^i$ as above and let

$$v' = 2^r k + v = \sum_{i=0}^{s+r-1} v_i 2^i$$

for some integer $k > 1$. After the first $s$ iterations of the application of
Algorithm 2.4 to $S_1$, $c_1$, and $v_{s+r-1}, v_{s+r-2}, \ldots, v_0$ the value for $e$ equals $k$ and
$S_y = S_{2k+1}$. The remaining $r$ iterations result in $S_{2v'+1} = S_{2^{r+1}k+2v+1}$, and
are the same as if Algorithm 2.4 was applied to $S_y$ (as opposed to $S_1$) and
$v_{r-1}, v_{r-2}, \ldots, v_0$. It follows that if Algorithm 2.4 is applied to $S_{2k+1}$, $c_1$, and
$v_{r-1}, v_{r-2}, \ldots, v_0$, then the resulting value is $S_{2^{r+1}k+2v+1}$. Note that the $v_i$’s do
not have to be non-zero. Thus, given any (odd or even) $t < 2^{r+1}$, $S_k$, and $c_1$,
the triple $S_{2^{r+1}k+t}$ can be computed in $7\log_2 t$ multiplications in $\mathbb{F}_p$. This leads
to the following double exponentiation method for XTR.



2.5 Matrix-Less XTR Double Exponentiation. Let $a$ and $b$ be integers
with $0 < a, b < q$, and let $S_k$ and $c_1$ be given. To compute $c_{bk+a}$ do the following.

1. Let $r$ be such that $2^r < q < 2^{r+1}$.

2. Compute $d = b/2^{r+1} \bmod q$ and $t = a/d \bmod q$.

3. Compute $S_{2^{r+1}k+t}$:

- Use Facts 2a and 2e to compute $S_{2k+1}$ based on $S_k$.

- If $t$ is odd let $t' = t$, else let $t' = t - 1$.

- Let $t' = 2v + 1$.

- Let $v = \sum_{i=0}^{r-1} v_i 2^i$ with $v_i \in \{0, 1\}$ (and $v_{r-1}, v_{r-2}, \ldots$ possibly zero).

- Apply Algorithm 2.4 to $S_{2k+1}$, $c_1$, and $v_{r-1}, v_{r-2}, \ldots, v_0$, resulting in
$S_{2^{r+1}k+t'}$.

- If $t$ is odd then $S_{2^{r+1}k+t} = S_{2^{r+1}k+t'}$, else use Fact 2c to compute
$S_{2^{r+1}k+t} = S_{2^{r+1}k+t'+1}$ based on $S_{2^{r+1}k+t'}$.

4. Let $\tilde{c}_1 = c_{2^{r+1}k+t}$.

5. Compute $\tilde{S}_1 = (\tilde{c}_0, \tilde{c}_1, \tilde{c}_2) = (3, \tilde{c}_1, \tilde{c}_1^2 - 2\tilde{c}_1^p)$ (cf. Fact 1c).

6. Apply Algorithm 2.4 to $\tilde{S}_1$, $\tilde{c}_1$ and the bits containing the binary
representation of $d$, resulting in $\tilde{S}_d = (\tilde{c}_{d-1}, \tilde{c}_d, \tilde{c}_{d+1})$.

7. The resulting $\tilde{c}_d$ equals $c_{d(2^{r+1}k+t) \bmod q} = c_{bk+a}$.
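The exponent bookkeeping of Steps 1, 2 and 7 can be checked on integers alone. The sketch below assumes Python 3.8 or later (for modular inverses through `pow` with a negative exponent) and a prime $q$; the function name is illustrative and no field arithmetic is performed.

```python
def split_exponents(a, b, q):
    """Steps 1-2 of Algorithm 2.5: return (r, d, t) for a prime q."""
    r = q.bit_length() - 1                  # 2^r < q < 2^(r+1) for odd q
    d = b * pow(2, -(r + 1), q) % q         # d = b / 2^(r+1) mod q
    t = a * pow(d, -1, q) % q               # t = a / d mod q
    return r, d, t

# Step 7: d * (2^(r+1) * k + t) is congruent to b*k + a modulo q.
r, d, t = split_exponents(100, 77, 271)
print((d * (2 ** (r + 1) * 5 + t) - (77 * 5 + 100)) % 271)
```

The printed residue is zero whenever the congruence of Step 7 holds.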

Algorithm 2.5 takes about $14\log_2 q$ multiplications in $\mathbb{F}_p$. This is a small
constant number of multiplications in $\mathbb{F}_p$ better than [10, Algorithm 2.4.8]
(assuming the faster field arithmetic is used there too). For realistic choices of $q$ the
speedup achieved using Algorithm 2.5 is thus barely noticeable. Nevertheless, 
it is a significant result because the fact that the matrices as required for [10, 
Algorithm 2.4.8] are no longer needed, facilitates implementation of XTR. In 
Section 3 of this paper a more substantial improvement over the double expo- 
nentiation method from [10] is described that does not require matrices either. 

3 Improved Double Exponentiation 

In this section it is shown how $c_{bk+a}$ can be computed based on $S_k$ and $c_1$
(or, equivalently, based on $S_{k-1} = (c_{k-2}, c_{k-1}, c_k)$ and $c_1$, cf. Fact 2.3.1b) in a
single iteration, as opposed to the two iterations in Algorithm 2.5. For greater
generality, it is shown how $c_{bk+a\ell}$ is computed, based on $c_k$, $c_\ell$, $c_{k-\ell}$, and $c_{k-2\ell}$.
A rough outline of the new XTR double exponentiation method is as follows.
Let $u = k$, $v = \ell$, $d = b$, and $e = a$. It follows that $ud + ve = bk + a\ell$ and that
$c_u$, $c_v$, $c_{u-v}$, and $c_{u-2v}$ are known. The values of $d$ and $e$ are decreased, while
at the same time $u$ and $v$ (and thereby $c_u$, $c_v$, $c_{u-v}$, and $c_{u-2v}$) are updated,
in order to maintain the invariant $ud + ve = bk + a\ell$. The changes in $d$ and
$e$ are effected in such a way that at a given point $d = e$. But if $d = e$, then
$bk + a\ell = ud + ve = d(u + v)$, so that $c_{bk+a\ell}$ follows by computing $c_{u+v}$ and next
$c_{d(u+v)}$ (cf. Fact 2.3.1c).

There are various ways in which d and e can be changed. The most efficient 
method to date was proposed by P.L. Montgomery in [15], for the computation 
of second degree recurrent sequences. The method below is an adaptation of [15, 
Table 4] to the present case of third degree sequences. 



3.1 Simultaneous XTR Double Exponentiation. Let $a$, $b$, $c_k$, $c_\ell$, $c_{k-\ell}$
and $c_{k-2\ell}$ be given, with $0 < a, b < q$. To compute $c_{bk+a\ell}$ do the following.

1. Let $u = k$, $v = \ell$, $d = b$, $e = a$, $c_u = c_k$, $c_v = c_\ell$, $c_{u-v} = c_{k-\ell}$, $c_{u-2v} = c_{k-2\ell}$,
$f_2 = 0$, and $f_3 = 0$ ($u$ and $v$ are carried along for expository purposes only).

2. As long as $d$ and $e$ are both even, replace $(d, e)$ by $(d/2, e/2)$ and $f_2$ by $f_2 + 1$.

3. As long as $d$ and $e$ are both divisible by 3, replace $(d, e)$ by $(d/3, e/3)$ and
$f_3$ by $f_3 + 1$.

4. As long as $d \neq e$ replace $(d, e, u, v, c_u, c_v, c_{u-v}, c_{u-2v})$ by the 8-tuple given
below.

a) If $d > e$ then

i. if $d < 4e$, then $(e, d - e, u + v, u, c_{u+v}, c_u, c_v, c_{v-u})$.

ii. else if $d$ is even, then $(d/2, e, 2u, v, c_{2u}, c_v, c_{2u-v}, c_{2(u-v)})$.

iii. else if $e$ is odd, then $((d-e)/2, e, 2u, u + v, c_{2u}, c_{u+v}, c_{u-v}, c_{-2v})$.

iv. optional:
else if $d \equiv e \bmod 3$, then $((d-e)/3, e, 3u, u + v, c_{3u}, c_{u+v}, c_{2u-v}, c_{u-2v})$.

v. else ($e$ is even), then $(e/2, d, 2v, u, c_{2v}, c_u, c_{2v-u}, c_{2(v-u)})$.

b) Else (if $e > d$)

i. if $e < 4d$, then $(d, e - d, u + v, v, c_{u+v}, c_v, c_u, c_{u-v})$.

ii. else if $e$ is even, then $(e/2, d, 2v, u, c_{2v}, c_u, c_{2v-u}, c_{2(v-u)})$.

iii. else if $d$ is odd, then $((e-d)/2, d, 2v, u + v, c_{2v}, c_{u+v}, c_{v-u}, c_{-2u})$.

iv. optional:
else if $e \equiv 0 \bmod 3$, then $(e/3, d, 3v, u, c_{3v}, c_u, c_{3v-u}, c_{3v-2u})$.

v. optional:
else if $e \equiv d \bmod 3$, then $((e-d)/3, d, 3v, u + v, c_{3v}, c_{u+v}, c_{2v-u}, c_{v-2u})$.

vi. else ($d$ is even), then $(d/2, e, 2u, v, c_{2u}, c_v, c_{2u-v}, c_{2(u-v)})$.

5. Apply Fact 2.3.1b to $c_u$, $c_v$, $c_{u-v}$, and $c_{u-2v}$, to compute $\tilde{c}_1 = c_{u+v}$.

6. Apply Algorithm 2.4 to $\tilde{S}_1 = (3, \tilde{c}_1, \tilde{c}_1^2 - 2\tilde{c}_1^p)$, $\tilde{c}_1$, and the binary
representation of $d$, resulting in $\tilde{c}_d = c_{d(u+v)}$ (cf. Fact 2.3.1c). Alternatively, and on
average faster, apply Algorithm 5.1 described below to compute $\tilde{c}_d = c_{d(u+v)}$
based on $\tilde{c}_1$ (note that this results in a recursive call to Algorithm 3.1).

7. Compute $c_{2^{f_2} d(u+v)}$ based on $c_{d(u+v)}$ by applying Fact 2.3.2a $f_2$ times.

8. Compute $c_{3^{f_3} 2^{f_2} d(u+v)}$ based on $c_{2^{f_2} d(u+v)}$ by applying Fact 2.3.2b $f_3$ times.

The asymmetry between Steps 4a and 4b is caused by the asymmetry between
$u$ and $v$, i.e., $c_{u-2v}$ is available but $c_{v-2u}$ is not. As a consequence, the case
‘$d \equiv 0 \bmod 3$’ is slower than the case ‘$e \equiv 0 \bmod 3$’ (Step 4(b)iv), and its inclusion
would slow down Algorithm 3.1.

Steps 4(a)i and 4(b)i each require a single application of Fact 2.3.1b at the
cost of three multiplications in $\mathbb{F}_p$. Steps 4(a)v and 4(b)ii each require two
applications of Fact 2.3.2a at the cost of $2 + 2 = 4$ multiplications in $\mathbb{F}_p$. Steps 4(a)ii,
4(a)iii, 4(b)iii, and 4(b)vi each require an application of Fact 2.3.1b and two
applications of Fact 2.3.2a at the cost of $3 + 2 + 2 = 7$ multiplications in $\mathbb{F}_p$. The
three optional steps 4(a)iv, 4(b)iv, and 4(b)v each require two applications of
Fact 2.3.1b and one application of Fact 2.3.2b for a total cost of $3 + 3 + 4.5 = 10.5$
multiplications in $\mathbb{F}_p$.
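The case analysis of Steps 1 to 4 and the per-step costs quoted above can be traced on the exponents alone. The sketch below is such an integer-level simulation under stated assumptions: it carries $d$, $e$, $u$, $v$ but no trace values, and the function name and the returned tuple are illustrative choices.

```python
def double_exp_schedule(a, b, k, l, optional=True):
    """Steps 1-4 of Algorithm 3.1 on exponents; returns (f2, f3, d, u, v, cost)."""
    u, v, d, e = k, l, b, a
    f2 = f3 = 0
    cost = 0
    while d % 2 == 0 and e % 2 == 0:
        d, e, f2 = d // 2, e // 2, f2 + 1
    while d % 3 == 0 and e % 3 == 0:
        d, e, f3 = d // 3, e // 3, f3 + 1
    while d != e:
        if d > e:
            if d < 4 * e:                              # 4(a)i
                d, e, u, v, c = e, d - e, u + v, u, 3
            elif d % 2 == 0:                           # 4(a)ii
                d, u, c = d // 2, 2 * u, 7
            elif e % 2 == 1:                           # 4(a)iii
                d, u, v, c = (d - e) // 2, 2 * u, u + v, 7
            elif optional and d % 3 == e % 3:          # 4(a)iv
                d, u, v, c = (d - e) // 3, 3 * u, u + v, 10.5
            else:                                      # 4(a)v, e is even
                d, e, u, v, c = e // 2, d, 2 * v, u, 4
        else:
            if e < 4 * d:                              # 4(b)i
                e, u, c = e - d, u + v, 3
            elif e % 2 == 0:                           # 4(b)ii
                d, e, u, v, c = e // 2, d, 2 * v, u, 4
            elif d % 2 == 1:                           # 4(b)iii
                d, e, u, v, c = (e - d) // 2, d, 2 * v, u + v, 7
            elif optional and e % 3 == 0:              # 4(b)iv
                d, e, u, v, c = e // 3, d, 3 * v, u, 10.5
            elif optional and e % 3 == d % 3:          # 4(b)v
                d, e, u, v, c = (e - d) // 3, d, 3 * v, u + v, 10.5
            else:                                      # 4(b)vi, d is even
                d, u, c = d // 2, 2 * u, 7
        cost += c
    return f2, f3, d, u, v, cost
```

On exit $d = e$, so $2^{f_2} 3^{f_3} d(u + v) = bk + a\ell$; the returned cost covers Step 4 only, not Steps 5 to 8.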

In Table 1 the number of multiplications in Fp required by Algorithm 3.1 is 
given, both with and without optional steps 4(a)iv, 4(b)iv, and 4(b)v. Each set 
of entries is averaged over the same collection of 2^° randomly selected t’s, a’s. 
and b’s, with $t$ of the size specified in Table 1 and $a$ and $b$ randomly selected from
$\{1, 2, \ldots, t - 1\}$. For regular double exponentiation $t \approx q$, but $t \approx \sqrt{q}$ for the
application in Section 4. It follows from Table 1 that inclusion of the optional 
steps leads to an overall reduction of more than 6% in the expected number of 
multiplications in Fp. For the optional steps it is convenient to keep track of the 
residue classes of d and e modulo 3. These are easily updated if any of the other 
steps applies, but require a division by 3 if either one of the optional steps is 
carried out. It depends on the implementation and the platform whether or not 
an overall saving is obtained by including the optional steps. In most software 
implementations it will most likely be worthwhile. 



Table 1. Empirical performance of Algorithm 3.1, with 0 < a, b < t.
Entries are numbers of multiplications in $\mathbb{F}_p$; $T = \lceil \log_2 t \rceil$ and σ denotes
the standard deviation.

       including steps 4(a)iv, 4(b)iv, and 4(b)v    without steps 4(a)iv, 4(b)iv, and 4(b)v
   T   average           σ              σ/√T        average           σ              σ/√T
  60    350.01 = 5.83T   20.5 = 0.34T   2.65         372.89 = 6.21T   30.0 = 0.50T   3.88
  70    410.42 = 5.86T   22.2 = 0.32T   2.65         437.41 = 6.25T   32.6 = 0.47T   3.89
  80    470.84 = 5.89T   23.7 = 0.30T   2.65         501.94 = 6.27T   34.8 = 0.44T   3.90
  90    531.21 = 5.90T   25.2 = 0.28T   2.66         566.36 = 6.29T   37.0 = 0.41T   3.90
 100    591.63 = 5.92T   26.5 = 0.27T   2.65         630.85 = 6.31T   39.1 = 0.39T   3.91
 110    652.03 = 5.93T   27.8 = 0.25T   2.65         695.40 = 6.32T   41.1 = 0.37T   3.92
 120    712.39 = 5.94T   29.1 = 0.24T   2.66         759.87 = 6.33T   43.0 = 0.36T   3.93
 130    772.78 = 5.94T   30.2 = 0.23T   2.65         824.31 = 6.34T   44.6 = 0.34T   3.92
 140    833.19 = 5.95T   31.5 = 0.22T   2.66         888.91 = 6.35T   46.4 = 0.33T   3.92
 150    893.66 = 5.96T   32.5 = 0.22T   2.65         953.34 = 6.36T   48.1 = 0.32T   3.93
 160    953.98 = 5.96T   33.6 = 0.21T   2.66        1017.79 = 6.36T   49.7 = 0.31T   3.93
 170   1014.42 = 5.97T   34.7 = 0.20T   2.66        1082.36 = 6.37T   51.3 = 0.30T   3.93
 180   1074.84 = 5.97T   35.7 = 0.20T   2.66        1146.88 = 6.37T   52.7 = 0.29T   3.93
 190   1135.19 = 5.97T   36.6 = 0.19T   2.66        1211.34 = 6.38T   54.3 = 0.29T   3.94
 200   1195.58 = 5.98T   37.6 = 0.19T   2.66        1275.82 = 6.38T   55.7 = 0.28T   3.94
 210   1256.05 = 5.98T   38.5 = 0.18T   2.66        1340.23 = 6.38T   57.1 = 0.27T   3.94
 220   1316.42 = 5.98T   39.5 = 0.18T   2.66        1404.75 = 6.39T   58.5 = 0.27T   3.94
 230   1376.87 = 5.99T   40.3 = 0.18T   2.66        1469.36 = 6.39T   59.7 = 0.26T   3.94
 240   1437.25 = 5.99T   41.2 = 0.17T   2.66        1533.89 = 6.39T   61.1 = 0.25T   3.94
 250   1497.61 = 5.99T   42.0 = 0.17T   2.66        1598.22 = 6.39T   62.3 = 0.25T   3.94
 260   1558.00 = 5.99T   42.9 = 0.17T   2.66        1662.80 = 6.40T   63.7 = 0.24T   3.95
 270   1618.47 = 5.99T   43.8 = 0.16T   2.66        1727.31 = 6.40T   64.9 = 0.24T   3.95
 280   1678.74 = 6.00T   44.5 = 0.16T   2.66        1791.85 = 6.40T   66.1 = 0.24T   3.95
 290   1739.17 = 6.00T   45.3 = 0.16T   2.66        1856.32 = 6.40T   67.2 = 0.23T   3.94
 300   1799.57 = 6.00T   46.1 = 0.15T   2.66        1920.88 = 6.40T   68.4 = 0.23T   3.95


Conjecture 3.2 Given integers $a$ and $b$ with $0 < a, b < q$ and trace values $c_k$,
$c_\ell$, $c_{k-\ell}$, and $c_{k-2\ell}$, the trace value $c_{bk+a\ell}$ can on average be computed in about
$6\log_2(\max(a, b))$ multiplications in $\mathbb{F}_p$ using Algorithm 3.1.

It follows that XTR double exponentiation using Algorithm 3.1 is on average 
faster than the XTR single exponentiation from [10] (given in Algorithm 2.4), 
and more than twice as fast as the previous methods to compute $c_{bk+a\ell}$ ([10,
Algorithm 2.4.8 and Theorem 2.4.9] and Algorithm 2.5). An additional
advantage of Algorithm 3.1 is that, like Algorithm 2.5, it does not require matrices.
These advantages have considerable practical consequences, not only for the
performance of XTR signature verification (Section 6), but also for the
accessibility and ease of implementation of XTR. In Sections 4 and 5 consequences of
Algorithm 3.1 for XTR single exponentiation are given. 

Based on Table 1 the expected practical behavior of Algorithm 3.1 is well 
understood, and the practical merits of the method are beyond doubt. However, 
a satisfactory theoretical analysis of Algorithm 3.1, or the second degree original 
from [15], is still lacking. The iteration in Algorithm 3.1 is reminiscent of the 
binary and subtractive Euclidean greatest common divisor algorithms. Iterations 
of that sort typically exhibit an unpredictable behavior with a wide gap between 
worst and average case performance; see for instance [1,7,19] and the analysis 
attempts and open problems in [15]. 

This is further illustrated in Figure 1. There the average number of
multiplications for $\lceil \log_2 t \rceil = 170$ is given as a function of the value of the constant
in Steps 4(a)i and 4(b)i of Algorithm 3.1. The value 4 is close to optimal and 
convenient for implementation. However, it can be seen from Figure 1 that a 
value close to 4.8 is somewhat better, if one’s sole objective is to minimize the 
number of multiplications in Fp, as opposed to minimizing the overall runtime. 
The curves in Figure 1 were generated for constants ranging from 2 to 8 with 
stepsize 1/16, per constant averaged over the same collection of randomly 
selected t’s, a’s, and b’s. The remarkable shape of the curves - both with at least 
four local minima - is a clear indication that the exact behavior of Algorithm 3.1 
will be hard to analyse. It is of no immediate importance for the present paper 
and left as a subject for further study. 

Remark 3.3 As shown in Appendix A other small improvements can be ob- 
tained by distinguishing more different cases than in Algorithm 3.1. The version 
presented above represents a good compromise that combines reasonable over- 
head with decent performance. In practical circumstances the performance of 
Algorithm 3.1 is on average close to optimal. 

Remark 3.4 If Algorithm 3.1 is implemented using the slower field arithmetic 
from [10, Lemma 2.1.1], as opposed to the improved arithmetic from 2.1, it 
can on average be expected to require $7.4\log_2(\max(a, b))$ multiplications in $\mathbb{F}_p$.
This is still more than twice as fast as the method from [10] (using the slower 
arithmetic), but more than 20% slower than Conjecture 3.2. 

Remark 3.5 Unlike the XTR exponentiation methods from [10], different in- 
structions are carried out by Algorithm 3.1 for different input values. This makes 
Algorithm 3.1 inherently more vulnerable to environmental attacks than the 
methods from [10] (cf. [10, Remark 2.3.9]). If the possibility of such attacks is a 
concern, then utmost care should be taken while implementing Algorithm 3.1. 

Fig. 1. Dependence on the value of the constant.

4 Single Exponentiation with Precomputation

Suppose that for a fixed $c_1$ several $c_u$’s for different $u$’s, with $0 < u < q$, have to
be computed. In this section it is shown that, after a small amount of
precomputation, this can be done using Algorithm 3.1 in less than half the number of
multiplications in $\mathbb{F}_p$ that would be required by Algorithm 2.4.

Let $t = 2^{\lceil (\log_2 q)/2 \rceil}$ and suppose that $S_{t-1} = (c_{t-2}, c_{t-1}, c_t)$ has been
precomputed based on $c_1$. For any $u \in \{0, 1, \ldots, q - 1\}$ non-negative integers $a$ and
$b$ of at most $1 + (\log_2 q)/2$ bits can simply be computed such that $u = bt + a$.
Given $S_{t-1}$ and $c_1$, the value $c_u$ can then be computed using Algorithm 3.1 with
$k = t$ and $\ell = 1$. This leads to the following precomputation and XTR single
exponentiation with precomputation.

4.1 Precomputation. Let $c_1$ be given. To precompute values $t$ and $S_{t-1} =
(c_{t-2}, c_{t-1}, c_t)$ do the following.

1. Let $t = 2^{\lceil (\log_2 q)/2 \rceil}$, $v = (t - 2)/2$, and let $v_{r-1}, v_{r-2}, \ldots, v_0$ be the binary
representation of $v$ (so $v_i = 1$ for $0 \le i < r$ for $t = 2^{\lceil (\log_2 q)/2 \rceil}$).

2. Apply Algorithm 2.4 to $S_1 = (3, c_1, c_1^2 - 2c_1^p)$, $c_1$, and $v_{r-1}, v_{r-2}, \ldots, v_0$ to
compute $S_{2v+1} = S_{t-1}$.

The value $S_{t-1}$ computed by Algorithm 4.1 consists of the traces of three
consecutive powers of the subgroup generator corresponding to $c_1$. Algorithm 4.1 takes
essentially a single application of Algorithm 2.4, and thus about $3.5\log_2 q$
multiplications in $\mathbb{F}_p$, since $\log_2 t \approx (\log_2 q)/2$. Improved XTR single exponentiation
Algorithm 5.1 given below would require more than a single application, because
it produces just the trace of a single power, and not its two ‘nearest neighbors’
as well. With [11, Theorem 5.1], which for most $t$’s allows fast computation of
$c_{t+1}$ given $c_1$, $c_{t-1}$, and $c_t$, two applications of Algorithm 5.1 would suffice. But
that is still expected to be slower than a single application of Algorithm 2.4, as 
follows from Corollary 5.3. 

4.2 XTR Single Exponentiation with Precomputation. Let $u$, $c_1$, $t$, and
$S_{t-1}$ be given, with $0 < u < q$. To compute $c_u$, do the following.

1. Compute non-negative integers $a$ and $b$ such that $u = bt + a \bmod q$ and $a$
and $b$ are at most about $\sqrt{q}$:

- If $\log_2(t \bmod q) \approx (\log_2 q)/2$ (as in 4.1), then use long division to
compute $a$ and $b$ such that $u = b(t \bmod q) + a$.

- Otherwise, use the lattice-based method described in 4.4. With the
proper choice of $t$ this results in $a$ and $b$ that are small enough.

2. If $b = 0$, then compute $c_a = c_u$ using either Algorithm 2.4 or Algorithm 5.1,
based on $c_1$.

3. Otherwise, if $a = 0$, then compute $\tilde{c}_b = c_{tb} = c_u$ using either Algorithm 2.4
or Algorithm 5.1, based on $\tilde{c}_1 = c_t$.

4. Otherwise, if $a \neq 0$ and $b \neq 0$, then do the following:

- Let $k = t$, $\ell = 1$, so that $S_{t-1} = (c_{k-2\ell}, c_{k-\ell}, c_k)$ and $c_\ell = c_1$.

- Use Algorithm 3.1 to compute $c_{bk+a\ell} = c_u$ based on $a$, $b$, $c_k$, $c_\ell$, $c_{k-\ell}$,
and $c_{k-2\ell}$.



Obviously, any $t$ of about the same size as $\sqrt{q}$ will do. A power of 2, however,
facilitates the computation of a and b in Step 1 of Algorithm 4.2. Algorithm 4.2 
allows easy implementation and, apart from the precomputation, the perfor- 
mance overhead on top of the call to Algorithm 2.4, 5.1, or 3.1 is negligible. The 
expected runtime of Algorithm 4.2 follows from Conjecture 3.2. 
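For the power-of-two choice of $t$ from 4.1, Step 1 of Algorithm 4.2 reduces to a single division with remainder (in practice a shift and a mask). A minimal sketch, assuming $q$ is not itself a power of two; the function names are illustrative.

```python
def precomputation_point(q):
    """t = 2^ceil(log2(q)/2), computed from the bit length of q."""
    return 1 << ((q.bit_length() + 1) // 2)

def split_exponent(u, t):
    """Return (a, b) with u = b*t + a and 0 <= a < t."""
    b, a = divmod(u, t)
    return a, b

q = 271                       # toy subgroup order
t = precomputation_point(q)   # 32, since ceil(log2(271)/2) = 5
print(t, split_exponent(200, t))
```

Both halves have about half the bit length of $q$, which is what lets Algorithm 3.1 run on inputs of size about $\sqrt{q}$.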

Corollary 4.3 Given integers $u$ and $t$ with $0 < u < q$ and $\log_2 t \approx (\log_2 q)/2$
and trace values $c_1$, $c_t$, $c_{t-1}$, and $c_{t-2}$, the trace value $c_u$ can on average be
computed in about $3\log_2 u$ multiplications in $\mathbb{F}_p$ using Algorithm 4.2.

This is more than 60% faster than Algorithm 2.4 as described in [10] using the 
slower field arithmetic. It can be used in the first place by the owner of the XTR 
key containing $c_1$. Thus, XTR signature generation can on average be done more
than 60% faster than before [10, Section 4.3]. It can also be used by shared users
of an XTR key, such as in Diffie-Hellman key agreement. However, it only affects
the first exponentiation to be carried out by each party: party A’s computation
of $c_a$ given $c_1$ and a random $a$ can be done on average more than 60% faster,
but the computation of $c_{ab}$ based on the value $c_b$ received from party B is not
affected by this method. See Section 5 for how to speed up the computation of $c_{ab}$
as well.

The precomputation scheme may also be useful for XTR-ElGamal encryption 
[10, Section 4.2]. In XTR-ElGamal encryption the public key contains two trace
values, $c_1$ and $c_k$, where $k$ is the secret key. The sender (who does not know $k$)
picks a random integer $b$, computes $c_b$ based on $c_1$, computes $c_{bk}$ based on $c_k$, uses
$c_{bk}$ to (symmetrically) encrypt the message, and sends the resulting encryption
and $c_b$ to the owner of $k$. If the sender uses XTR-ElGamal encryption more than
once with the same $c_1$ and $c_k$, then it is advantageous to use precomputation.
In this application two precomputations have to be carried out, once for $c_1$ and
once for $c_k$. The recipient has to compute $c_{bk}$ based on the value $c_b$ received (and
its secret $k$). Because $c_b$ will not occur again, precomputation based on $c_b$ does
not make sense for the party performing XTR-ElGamal decryption.

4.4 Fast Precomputation. It is shown that the choice $t = p$ leads to a faster
precomputation, while only marginally slowing down Step 1 of Algorithm 4.2.
The triple $S_{p-1} = (c_{p-2}, c_{p-1}, c_p)$ follows from $c_p = c_1^p$ (Fact 2.3.1a), $c_{p-1} = c_1$
(because if $g$ is a root with trace $c_1$, then $g^{p-1} = g^{p^2}$ is one of its conjugates and
has the same trace), and from the fact that, according to [12, Proposition 5.7],
$c_{p-2}$ can be computed at the cost of a square-root computation in $\mathbb{F}_p$. Here it is
assumed that the public key containing $p$, $q$, and $c_1$ contains an additional single
bit of information to resolve the square-root ambiguity.^1 Thus, if $p \equiv 3 \bmod 4$
recipients of XTR public key data with $p$ and $q$ of the above form can do the
precomputation of $S_{p-1}$ at a cost of at most $\approx 1.3\log_2 p$ multiplications in $\mathbb{F}_p$,
assuming the owner of the key sends the required bit along. The storage overhead
(on top of $c_1$) for $S_{p-1}$ is just a single element of $\mathbb{F}_{p^2}$, as opposed to three elements
for $S_{t-1}$ as in 4.1.

If $p \bmod q \approx \sqrt{q}$ then non-negative $a$ and $b$ of order about $\sqrt{q}$ in Step 1 of
Algorithm 4.2 can be found at the cost of a division with remainder. This is, for
instance, the case if $p$ and $q$ are chosen as $r^2 + 1$ and $r^2 - r + 1$, respectively,
as suggested in [10, Section 3.1]. However, usage of such primes $p$ and $q$ is not
encouraged in [10] because of potential security hazards related to the use of 
primes p of a ‘special form’. 

Interestingly, and perhaps more surprisingly, sufficiently small $a$ and $b$ exist
and can be found quickly in the general case as well. Let $L$ be the two-dimensional
integral lattice $\{(e_1, e_2)^T \in \mathbb{Z}^2 : e_1 + e_2 p \equiv 0 \bmod q\}$. If $(e_1, e_2)^T \in L$, then

$$(e_1 + e_2) - e_1 p \equiv -e_2 p + e_2 + e_2 p^2 = e_2 (p^2 - p + 1) \equiv 0 \bmod q$$

so that $(e_1 + e_2, -e_1)^T \in L$. Let $v_1 = (e_1, e_2)^T$ be the shortest non-zero vector
of $L$ (using the $L_2$-norm). It may be assumed that $e_1 > 0$. It follows that $e_2 > 0$,
because otherwise $(e_1 + e_2, -e_1)^T$ or $(-e_2, e_1 + e_2)^T \in L$ would be shorter than $v_1$.
If $v_2$ is the shortest of $(e_1 + e_2, -e_1)^T$, $(-e_2, e_1 + e_2)^T \in L$, then $|v_2| < 2|v_1|$ and
$\{v_1, v_2\}$ is easily seen to be a shortest basis for $L$, with $e_1^2 + e_1 e_2 + e_2^2 = q$ and
$e_1, e_2 < \sqrt{q}$. This implies that given $\{v_1, v_2\}$ and any integer vector $(-u, 0)^T$,
there is a vector $(a, b)^T$ with $0 < a, b < 2\sqrt{q}$ such that $(-u + a, b)^T \in L$. It
follows that $-u + a + bp \equiv 0 \bmod q$, i.e., $u \equiv bp + a \bmod q$ as desired. Using the
initial basis $\{(q, 0)^T, (-p, 1)^T\}$, the vector $v_1$ can be found quickly [3, Algorithm
1.3.14], and for any $u$ the vector $(a, b)^T$ can easily be computed. In [6, Section 4] a
similar construction was independently developed for ECC scalar multiplication.

^1 The statement in [12, Proposition 5.7] that this requires a square-root computation
in $\mathbb{F}_{p^2}$, as opposed to $\mathbb{F}_p$, is incorrect. This follows immediately from the proof of [12,
Proposition 5.7].
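The lattice construction can be tried out on toy parameters. The sketch below is not the method of [3, Algorithm 1.3.14]; it assumes a plain Lagrange (Gauss) reduction of the initial basis followed by rounding to a nearby lattice vector, so the resulting $a$ and $b$ are small but may be negative, and a further adjustment (not shown) would be needed to obtain the non-negative pair described in the text. All function names are illustrative.

```python
def gauss_reduce(v1, v2):
    """Lagrange reduction of a 2-dimensional integer basis."""
    def n2(w):
        return w[0] * w[0] + w[1] * w[1]
    if n2(v1) > n2(v2):
        v1, v2 = v2, v1
    while True:
        dot = v1[0] * v2[0] + v1[1] * v2[1]
        m = (2 * dot + n2(v1)) // (2 * n2(v1))   # nearest integer to dot/|v1|^2
        v2 = (v2[0] - m * v1[0], v2[1] - m * v1[1])
        if n2(v2) >= n2(v1):
            return v1, v2
        v1, v2 = v2, v1

def round_div(n, d):
    """Nearest integer to n/d for non-zero d."""
    if d < 0:
        n, d = -n, -d
    return (2 * n + d) // (2 * d)

def decompose(u, p, q):
    """Return small (a, b), possibly negative, with u = b*p + a mod q."""
    v1, v2 = gauss_reduce((q, 0), (-p % q, 1))
    det = v1[0] * v2[1] - v1[1] * v2[0]
    x = round_div(u * v2[1], det)                # (u, 0) = x*v1 + y*v2 over Q
    y = round_div(-u * v1[1], det)
    return u - x * v1[0] - y * v2[0], -x * v1[1] - y * v2[1]
```

With the toy values $p = 29$ and $q = 271$ (so that $q$ divides $p^2 - p + 1 = 813$), the reduced short vector satisfies $e_1^2 + e_1 e_2 + e_2^2 = q$ as stated above.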



Corollary 4.5 Given an integer $u$ with $0 < u < q$ and trace values $c_1$ and $c_{p-2}$,
the trace value $c_u$ can on average be computed in about $3\log_2 u$ multiplications
in $\mathbb{F}_p$ using Algorithm 4.2.

The owner of the key must explicitly compute $c_{p-2}$ in order to compute the
ambiguity-resolving bit. Thus, the owner cannot take advantage of fast precom- 
putation. This adds a minor cost to the key creation. 

5 Improved Single Exponentiation 

In this section it is shown how Algorithm 3.1 can be used to obtain an XTR 
single exponentiation method that is on average more than 25% faster than 
Algorithm 2.4. That is 35% faster than the single exponentiation from [10] based 
on the slower field arithmetic. Using Algorithm 3.1 to obtain an on average faster 
XTR single exponentiation is straightforward: to compute $c_u$ with $0 < u < q$
based on $c_1$ just apply Algorithm 3.1 to $k = \ell = 1$ and any positive $a, b$ with
$a + b = u$; a speedup of more than 14% over Algorithm 2.4 can then be expected
according to Table 1. 

The 25% faster method uses this same approach, but exploits the freedom of 
choice of a and b: if a and b, i.e., d and e in Algorithm 3.1, can be selected in such 
a way that the iteration in Step 4 of Algorithm 3.1 favors the ‘cheap’ steps, while 
still quickly decreasing d and e, then Algorithm 3.1 should run faster than for 
randomly selected a and b. Given the various substeps of Step 4 of Algorithm 3.1 
and the associated costs, a good way to split up u in the sum of positive a and 
$b$ seems to be such that $b/a$ is close to the golden ratio $\phi = \frac{1+\sqrt{5}}{2}$, i.e., the
asymptotic ratio between two consecutive Fibonacci numbers. This can be seen
as follows. If the initial ratio between $d$ and $e$ is close to $\phi$, then Step 4(a)i applies
and $d, e$ is replaced by $e, d - e$. This corresponds to a ‘Fibonacci-step back’ so
that the ratio between the new $d$ and $e$ (i.e., $e$ and $d - e$) can again be expected
to be close to $\phi$. Furthermore, the sum of $d$ and $e$ is reduced by a factor $\phi$,
which is a relatively good drop compared to the low cost of Step 4(a)i (namely, 
three multiplications in Fp). This leads to the following improved XTR single 
exponentiation. 

5.1 Improved XTR Single Exponentiation. Let $u$ and $c_1$ be given, with
$0 < u < q$. To compute $c_u$, do the following.

1. Let $a = \mathrm{round}(\frac{3-\sqrt{5}}{2} u)$ and $b = u - a$ (where $\mathrm{round}(x)$ is the integer closest
to $x$). As a result $b/a \approx \phi$ as above.

2. Let $k = \ell = 1$, $c_k = c_\ell = c_1$, $c_{k-\ell} = c_0 = 3$, $c_{k-2\ell} = c_{-1} = c_1^p$ (cf.
Fact 2.3.1a).

3. Apply Algorithm 3.1 to $a$, $b$, $c_k$, $c_\ell$, $c_{k-\ell}$, and $c_{k-2\ell}$, resulting in $c_{bk+a\ell} = c_u$.
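Step 1 and the resulting run of Fibonacci steps can be explored on integers. The sketch below assumes that the rounding is done with exact integer arithmetic through `math.isqrt` (Python 3.8 or later), and it counts the initial iterations for which $e < d < 2e$; the function names are illustrative.

```python
from math import isqrt

def golden_split(u):
    """a close to u*(3 - sqrt(5))/2 and b = u - a, so that b/a is close to phi."""
    a = (3 * u - isqrt(5 * u * u)) // 2
    return a, u - a

def fibonacci_steps(a, b):
    """Count leading iterations (d, e) -> (e, d - e) while e < d < 2e."""
    d, e, steps = b, a, 0
    while e < d < 2 * e:
        d, e = e, d - e
        steps += 1
    return steps, d, e

u = 2 ** 169 + 12345          # arbitrary 170-bit exponent
a, b = golden_split(u)
steps, d, e = fibonacci_steps(a, b)
print(steps, d.bit_length())
```

For an exponent of this size the run is over a hundred steps long and leaves $d$ at roughly half the bit length of $u$.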

Proposition 5.2 In the call to Algorithm 3.1 in Step 3 of Algorithm 5.1, the
values of $d$ and $e$ in Step 4 of Algorithm 3.1 are reduced to approximately half
their original sizes using a sequence of approximately $\log_\phi \sqrt{u}$ iterations using
just Step 4(a)i.

Proof. Let $m = \mathrm{round}(\log_\phi u)$. Asymptotically for $m \to \infty$ the values $a$ and $b$ in
Algorithm 5.1 satisfy $b/a = \phi + \epsilon_1$ with $|\epsilon_1| = O(2^{-m})$. Furthermore, for $n \to \infty$,
the $n$-th Fibonacci number $F_n$ satisfies $F_n/F_{n-1} = \phi + \epsilon_2$ with $|\epsilon_2| = O(2^{-n})$. It
follows that $a = \frac{F_{m-1}}{F_m} b + \epsilon_3$, where $|\epsilon_3|$ is bounded by a small positive constant.

Define $(d_0, e_0) = (b, a)$ and $(d_i, e_i) = (e_{i-1}, d_{i-1} - e_{i-1})$ for $i > 0$. With
induction it follows from $a = \frac{F_{m-1}}{F_m} b + \epsilon_3$ that

$$d_i = \frac{F_{m-i}}{F_m} b - (-1)^i F_i \epsilon_3 \qquad (2)$$

for $0 \le i < m$. Algorithm 3.1 as called from Algorithm 5.1 will perform Fibonacci
steps as long as $e_i < d_i < 2e_i$. But as soon as $d_i > 2e_i$ this nice behavior will be
lost. From $e_i = d_{i+1}$ and (2) it follows that $d_i > 2e_i$ is equivalent to

$$\frac{F_{m-i-3}}{F_m} b < (-1)^{i+1} F_{i+3} \epsilon_3.$$

Because $F_m/b$ and $|\epsilon_3|$ are both bounded by small positive constants, the first
time this condition will hold is when $F_{m-i-3}$ and $F_{i+3}$ are of the same order of
magnitude, i.e., $m - i - 3 \approx i + 3$. Thus, the Fibonacci behavior is lost after
about $m/2 = \log_\phi \sqrt{u}$ iterations, at which point $d_i \approx \sqrt{u}$ (this follows from (2)).
This completes the proof of Proposition 5.2.

Based on Proposition 5.2, a heuristic average runtime analysis of Algorithm 5.1
follows easily. The Fibonacci part consists of about $\log_\phi \sqrt{u}$ iterations consisting
of just Step 4(a)i of Algorithm 3.1, at a total cost of $3\log_\phi \sqrt{u} \approx 2.2\log_2 u$
multiplications in $\mathbb{F}_p$. Once the Fibonacci behavior is lost, the remaining $d$ and
$e$ are assumed to behave as random integers of about the same order of magnitude
as $\sqrt{u}$, so that, according to Conjecture 3.2, the remainder can on average be
expected to take about $6\log_2 \sqrt{u} = 3\log_2 u$ multiplications in $\mathbb{F}_p$.

Corollary 5.3 Given an integer $u$ with $0 < u < q$ and a trace value $c_1$, the
trace value $c_u$ can on average be computed in about $5.2\log_2 u$ multiplications in
$\mathbb{F}_p$ using Algorithm 5.1.

This corresponds closely to the actual practical runtimes. It is more than 25% 
better than Algorithm 2.4. Without the optional steps in Algorithm 3.1 the 
speedup is reduced to about 22%.

Remark 5.4 If insufficient precision is used in the computation of $a$ and $b$ in
Step 1 of Algorithm 5.1, then $\epsilon_3$ in the proof of Proposition 5.2 is no longer
bounded by a small constant. It follows that $d_i > 2e_i$ already holds for a smaller
value of $i$, implying that the Fibonacci behavior is lost earlier. A precise analysis
of the expected performance degradation as a function of the lack of precision
is straightforward. In practice this effect is very noticeable.

If $a$ and $b$ happen to be such that all steps are Fibonacci steps, then the cost
would be $4.3\log_2 u$. This is fewer than $\log_2 u$ multiplications in $\mathbb{F}_p$ better than
the average behavior obtained.



6 Timings 

To make sure that the methods introduced in this paper actually work, and to 
discover their runtime characteristics, all new methods were implemented and 
tested. In this section the results are reported, in such a way that the results 
can easily and meaningfully be compared to the timings reported in [10]. 

Algorithm 2.5 was implemented, tested for correctness, and it was confirmed 
that the speedup over the double exponentiation from [10] is negligible. However, 
implementing Algorithm 2.5 was shown to be significantly easier than it was for 
the matrix-based method from [10]. Thus, Algorithm 2.5 may still turn out to 
be valuable if Algorithm 3.1 cannot be used (Remark 3.5). 

The methods from Sections 3, 4, and 5 were implemented as well, and incor- 
porated in cryptographic XTR applications along with the old methods from [10]. 
The resulting runtimes are reported in Table 2. Each runtime is averaged over 
100 random keys and 100 cryptographic applications (on randomly selected data) 
per key. The timings for the XTR single exponentiations with precomputation 
do not include the time needed for the precomputations. The latter are given in 
the last two rows. All times are in milliseconds on a 600 MHz Pentium III NT
laptop, and are based on the use of a generic and not particularly fast software 
package for extended precision integer arithmetic [8]. More careful implementa- 
tion should result in much faster timings. The point of Table 2 is however not 
the absolute speed, but the relative speedup over the methods from [10]. 

The RSA timings are included to allow a meaningful interpretation of the 
timings: if the RSA signing operation runs x times faster using one’s own soft- 
ware and platform, then most likely XTR will also run x times faster compared 
to the figures in Table 2. For each key an odd 32-bit RSA public exponent was 
randomly selected. ‘CRT’ stands for ‘Chinese Remainder Theorem’. For a theo- 
retical comparison of the runtimes of RSA, XTR, ECC, and various other public 
key systems at several security levels, refer to [9]. 



Table 2. RSA, old XTR, and new XTR runtimes.

method                                  key selection  signing  verifying  encrypting  decrypting
1020-bit RSA, with CRT                  90S ms         40 ms    5 ms       5 ms        40 ms
1020-bit RSA, without CRT               -              123 ms   -          -           123 ms
170-bit XTR, old                        64 ms          10 ms    21 ms      21 ms       10 ms
170-bit XTR, new, no precomputation     62 ms          7.3 ms   8.6 ms     15 ms       7.3 ms
170-bit XTR, new, with precomputation   -              4.3 ms   -          8.6 ms      -
170-bit XTR, precomputation 4.1         -              4.4 ms   -          8.8 ms      -
170-bit XTR, fast precomputation 4.4    -              1.6 ms   -          6.0 ms      -

7 Application to LUC and ECC

The exponentiations in LUC [18] and ECC when using the curve parameteri- 
zation proposed in [14] can be evaluated using second degree recurrences. For 
LUC this is described in detail in [15]. For ECC it is described in [16] and fol- 
lows by combining [14] and [15]. For ease of reference the resulting runtimes are 
summarized in this section. 

7.1 LUC. Let p and q be primes such that q divides p + 1, and let g be a
generator of the order q subgroup of F_{p^2}^*. In LUC elements of <g> are represented
by their trace over F_p. Let v_n ∈ F_p denote the trace over F_p of g^n.

Conjecture 7.2 (cf. Conjecture 3.2) Given integers a and b with 0 < a, b < q
and trace values v_k, v_ℓ, and v_{k-ℓ}, the trace value v_{bk+aℓ} can on average be
computed in about 1.49 log_2(max(a, b)) multiplications and 0.33 log_2(max(a, b))
squarings in F_p, using the method implied by [15, Table 4].

Corollary 7.3 (cf. Corollary 4.3) Given integers u and t with 0 < u < q and
log_2 t ≈ (log_2 q)/2 and trace values v_1, v_t, and v_{t-1}, the trace value v_u can on
average be computed in about 0.75 log_2 u multiplications and 0.17 log_2 u squarings
in F_p using a generalization of Algorithm 4.2.

Corollary 7.4 (cf. Corollary 5.3) Given an integer u with 0 < u < q and a
trace value v_1, the trace value v_u can on average be computed in about 1.47 log_2 u
multiplications and 0.17 log_2 u squarings in F_p using a generalization of
Algorithm 5.1.

7.5 ECC. Let E be an elliptic curve over a prime field F_p, let E(F_p) be the
group of points of E over F_p, and let G ∈ E(F_p) be a point of prime order q.
As usual, the group operation in E(F_p) is written additively.

Conjecture 7.6 (cf. Conjecture 3.2) Given integers a and b with 0 < a, b < q
and points kG, ℓG, and (k - ℓ)G, the x-coordinate of the point (bk + aℓ)G can
on average be computed in approximately 7 log_2(max(a, b)) multiplications and
3.7 log_2(max(a, b)) squarings in F_p, using the method implied by [15, Table 4]
combined with the elliptic curve parameterization from [14].

Corollary 7.7 (cf. Corollary 4.3) Given integers u and t with 0 < u < q and
log_2 t ≈ (log_2 q)/2 and points G, tG, and (t - 1)G, the x-coordinate of the point
uG can on average be computed in about 3.5 log_2 u multiplications and 1.8 log_2 u
squarings in F_p using a generalization of Algorithm 4.2.

Corollary 7.8 (cf. Corollary 5.3) Given an integer u with 0 < u < q and a
point G, the x-coordinate of the point uG can on average be computed in about
6.4 log_2 u multiplications and 3.3 log_2 u squarings in F_p using a generalization of
Algorithm 5.1.

The single scalar multiplication algorithms are competitive with the ones
described in the literature [5]. The double scalar multiplication algorithm from [16]
(and as slightly adapted to obtain Conjecture 7.6) is substantially better than 
other ECC double scalar multiplication methods reported in the literature [2]. 
For appropriate elliptic curves Corollary 7.7 can be combined with the method 
proposed in [6], so that the runtime of Corollary 7.7 would hold for Corollary 7.8. 

8 Conclusion 

The XTR public key system as published in [10] is one of the fastest, most com- 
pact, and easiest to implement public key systems. In this paper it is shown that 
it is even faster and easier to implement than originally believed. The matrices 
from [10] can be replaced by the more general iteration from Section 3. This re- 
sults in 60% faster XTR signature applications, substantially faster encryption, 
decryption, and key agreement applications, and more compact implementations. 

Acknowledgment. The authors thank Peter Montgomery from Microsoft Re- 
search whose remarks [16] stimulated this research. 

Almost 2% can be saved compared to Algorithm 3.1 by distinguishing more cases

4. As long as d ≠ e replace (d, e, u, v, c_u, c_v, c_{u-v}, c_{u-2v}) by the 8-tuple given
below.

a) If d > e then

i. if d < 5.5e, then (e, d - e, u + v, u, c_{u+v}, c_u, c_v, c_{v-u}).

ii. else if d and e are odd, then ((d - e)/2, e, 2u, u + v, c_{2u}, c_{u+v}, c_{u-v}, c_{-2v}).

iii. else if d < 6.4e, then (e, d - e, u + v, u, c_{u+v}, c_u, c_v, c_{v-u}).

iv. else if d ≡ e mod 3, then ((d - e)/3, e, 3u, u + v, c_{3u}, c_{u+v}, c_{2u-v}, c_{u-2v}).

v. else if d is even, then (d/2, e, 2u, v, c_{2u}, c_v, c_{2u-v}, c_{2(u-v)}).

vi. else if d < 7.5e, then (e, d - e, u + v, u, c_{u+v}, c_u, c_v, c_{v-u}).

vii. else if de ≡ 2 mod 3, then ((d - 2e)/3, e, 3u, 2u + v, c_{3u}, c_{2u+v}, c_{u-v},
c_{-u-2v}).

viii. else (e is even), then (e/2, d, 2v, u, c_{2v}, c_u, c_{2v-u}, c_{2(v-u)}).

b) Else (if e > d)

i. if e < 5.5d, then (d, e - d, u + v, v, c_{u+v}, c_v, c_u, c_{u-v}).

ii. else if e is even, then (e/2, d, 2v, u, c_{2v}, c_u, c_{2v-u}, c_{2(v-u)}).

iii. else if e ≡ d mod 3, then ((e - d)/3, d, 3v, u + v, c_{3v}, c_{u+v}, c_{2v-u}, c_{v-2u}).

iv. else if de ≡ 2 mod 3, then (d, (e - 2d)/3, u + 2v, 3v, c_{u+2v}, c_{3v}, c_{u-v}, c_{u-4v}).

v. else if e < 7.4d, then (d, e - d, u + v, v, c_{u+v}, c_v, c_u, c_{u-v}).

vi. else if d is odd, then ((e - d)/2, d, 2v, u + v, c_{2v}, c_{u+v}, c_{v-u}, c_{-2u}).

Steps 4(a)vii and 4(b)iv require 13.5 and 12.5 multiplications in F_p, respectively.
The cost of the other steps is as in Section 3. The average cost to compute
c_{bk+aℓ} turns out to be about 5.9 log_2(max(a, b)) multiplications in F_p. Omission
of Steps 4(a)iii, 4(a)vi, and 4(b)v, combined with a constant 4 instead of 5.5 in
Steps 4(a)i and 4(b)i leads to an almost 1% speedup over Algorithm 3.1.


Abstract. We implement various computations in the braid groups
via practically efficient and theoretically optimized algorithms whose 
pseudo-codes are provided. The performance of an actual implementa- 
tion under various choices of parameters is listed. 



1 Introduction 

A new cryptosystem using the braid groups was proposed in [5] at Crypto 2000.
Since then, there has been no serious attempt to analyze the system besides
the one given by its inventors [7]. We think that this is because the braid group
is not familiar to most cryptographers and cryptanalysts. The primary purpose
of announcing our implementation is to encourage people to attack the braid
cryptosystem. In [7], a necessary condition is found for instances of the
mathematical problem on which the braid cryptosystem is based to be
intractable. This means that key selection is crucial
to maintaining the theoretical security of the braid cryptosystem. Thus key
generation is one of the areas where much research is required, and we think that
the search for strong keys should eventually be aided by computers. This is the
secondary purpose of our implementation.

In this paper we discuss implementation issues of the braid group given by 
either the Artin presentation [2] or the band-generator presentation [1]. Due to 
the analogy between the two presentations, our implementations on the two pre- 
sentations are basically identical, except the low-level layer consisting of data 
structures and algorithms for canonical factors, which play the role of the
building blocks for braids. Even though the algorithms of the present implementation
are our initial work, they are theoretically optimized so that every single
operation can be executed in at most O(n log n) time, where the braid index n
is the security parameter corresponding to the block size in
other cryptosystems. This speed is achieved because the canonical factors
are expressed as permutations that can be efficiently and naturally handled
by computers. The efficiency of the implementation shows that the braid group

C. Boyd (Ed.): ASIACRYPT 2001, LNCS 2248, pp. 144-156, 2001. 
is a good source of cryptographic primitives [5,6]. It is hard to think of any other
non-commutative groups that can be digitized as efficiently as the braid group.
Matrix groups are typical examples of non-commutative groups and in fact any
group can be considered as a matrix group via representations. But the group
multiplication in the braid group of index n is faster than the multiplication of
(n × n) matrices.

This paper is organized as follows. Section 2 is a quick review of the minimal
necessary background on braid groups. In Sections 3 and 4, we develop data
structures and algorithms for canonical factors and braids, respectively. In Section 5,
we show how to generate random braids. In Section 6, we discuss the
performance of our implementation, through the braid cryptosystems in [7]. Section 7
is our conclusion.



2 A Quick Review of the Braid Groups 

A braid is obtained by laying down a number of parallel strands and intertwining 
them so that they run in the same direction. In our convention, this direction is 
horizontally toward the right. The number of strands is called the braid index. 
The set of isotopy classes of braids of index n has a group structure, called
the n-braid group B_n, where the product of two braids x and y is nothing more
than laying down the two braids in a row and then matching the end of x to the
beginning of y.

Any braid can be decomposed as a product of simple braids. One type of
simple braids is the Artin generators σ_i that have a single crossing between the
i-th and (i+1)-st strands as in Figure 1 (a), and the other type is the band-generators
a_{ts} that have a single half-twist band between the t-th and s-th strands running over
all intermediate strands as in Figure 1 (b).

The n-braid group B_n is presented by the Artin generators σ_1, ..., σ_{n-1} and
relations σ_i σ_j = σ_j σ_i for |i - j| > 1 and σ_i σ_j σ_i = σ_j σ_i σ_j for |i - j| = 1. On the
other hand, B_n is also presented by the band-generators a_{ts} for n ≥ t > s ≥ 1
and relations a_{ts} a_{rq} = a_{rq} a_{ts} for (t - r)(t - q)(s - r)(s - q) > 0 and
a_{ts} a_{sr} = a_{tr} a_{ts} = a_{sr} a_{tr} for n ≥ t > s > r ≥ 1.

These will be called the Artin presentation and the band-generator
presentation, respectively. There are theoretically similar solutions to the word and
conjugacy problems in B_n for both presentations [1,2,3]. The band-generator
presentation has a computational advantage over the Artin presentation as far as the word
problem is concerned. Since almost all the machinery is identical in the two
theories, it will be convenient to introduce unified notation so that we may review
both theories at the same time.

1. Let B_n^+ be the monoid defined by the same generators and relations in a
given presentation. Elements in B_n^+ are called positive braids or positive
words. The relations in the Artin and band-generator presentations preserve
word-length of positive braids and so the word-length is easy to compute
for positive braids. The natural map B_n^+ → B_n is injective [1,4]. There
are no known presentations of B_n except these two that enjoy this injection
property needed for a fast solution to the word problem.

Fig. 1. Generators and fundamental braids



2. There is a fundamental braid D. It is Δ = (σ_1 ··· σ_{n-1})(σ_1 ··· σ_{n-2}) ··· σ_1
in the Artin presentation and δ = a_{n(n-1)} a_{(n-1)(n-2)} ··· a_{21} in the
band-generator presentation as shown in Figure 1 (c), (d). The fundamental braid
D can be written in many distinct ways as a positive word in both
presentations. Due to this flexibility, it has two important properties:

(i) For each generator a, D = aA = Ba for some A, B ∈ B_n^+.

(ii) For each generator a, aD = Dτ(a) and Da = τ^{-1}(a)D where τ is the
automorphism of B_n defined by τ(σ_i) = σ_{n-i} for the Artin presentation
and τ(a_{ts}) = a_{(t+1)(s+1)} for the band-generator presentation.

3. There are partial orders '≤', '≤_L' and '≤_R' in B_n. For two words V and W
in B_n, we say that V ≥ W (resp. V ≥_L W, V ≥_R W) if V = PWQ (resp.
V = WP, V = PW) for some P, Q ∈ B_n^+. If a word is compared against
either the empty word e or a power of D, all three orders are equivalent
due to the property (ii) above. Note that the partial orders depend on a
presentation of B_n and W is a positive word if and only if W ≥ e.

4. For two elements V and W in a partially ordered set, the meet V ∧ W (resp.
join V ∨ W) denotes the largest (resp. smallest) element among all elements
smaller (resp. larger) than V and W. If both the meet and join always exist
for any pair of elements in a partially ordered set, the set is said to have
a combinatorial lattice structure. The braid group B_n has a combinatorial
lattice structure for '≤_L' and '≤_R' in both presentations [3,1]. When
we want to distinguish the meet and join for left and right versions, we will
use '∧_L', '∧_R', '∨_L' and '∨_R'.

5. A braid satisfying e ≤ A ≤ D is called a canonical factor and [0, 1]_n denotes
the set of all canonical factors in B_n. The cardinality of [0, 1]_n is n! for the
Artin presentation, and the Catalan number C_n = (2n)! / (n! (n + 1)!) for the
band-generator presentation. Note that C_n is much smaller than n! and this
is one of the main reasons why it is sometimes computationally easier to work
with the band-generator presentation than the Artin presentation.

6. For a positive braid P, a decomposition P = A_0 P_0 is left-weighted if A_0 ∈
[0, 1]_n, P_0 ≥ e, and A_0 has the maximal length (or is maximal in '≤_L') among
all such decompositions. A left-weighted decomposition P = A_0 P_0 is unique.
A_0 is called the maximal head of P. The notion 'right-weighted' can also be
defined similarly.

7. Any braid W given as a word can be decomposed uniquely into

    W = D^u A_1 A_2 ··· A_k,    e < A_i < D,    u ∈ Z,    (1)

where the decomposition A_i A_{i+1} is left-weighted for each 1 ≤ i ≤ k - 1.
This decomposition, called the left canonical form of W, is unique and so it
solves the word problem. The integer u (resp. u + k) is called the infimum
(resp. supremum) of W and denoted by inf(W) (resp. sup(W)). The infimum
(resp. supremum) of W is the largest (resp. smallest) integer m such that
D^{-m} W ≥ e (resp. D^{-m} W ≤ e). The canonical length of W, denoted by len(W), is
given by k = sup(W) - inf(W) and will be used as an important parameter
together with the braid index n. The right canonical form of W can also be
defined similarly.
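
To make the size gap between the two sets of canonical factors concrete, the following minimal sketch tabulates n! against the Catalan number C_n. The function names are illustrative only and do not come from the implementation described in this paper.

```python
import math


def catalan(n):
    """n-th Catalan number C_n = (2n)! / (n! (n+1)!)."""
    return math.comb(2 * n, n) // (n + 1)


def canonical_factor_counts(n):
    """Return (n!, C_n): the number of canonical factors in the Artin
    presentation and in the band-generator presentation of B_n."""
    return math.factorial(n), catalan(n)


if __name__ == "__main__":
    for n in (5, 10, 20):
        artin, band = canonical_factor_counts(n)
        print(f"n={n}: n! = {artin}, C_n = {band}")
```

Since C_n grows roughly like 4^n while n! grows faster than any exponential, the gap widens quickly with the braid index.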



3 Canonical Factors 

3.1 Data Structures 

A canonical factor in the Artin presentation of B_n can be identified with the
associated n-permutation, which is obtained by replacing the i-th Artin generator
σ_i by the transposition of i and i + 1. We represent an n-permutation as an array
A of n integers, where A[i] is equal to the image of i under the permutation. A
is called a permutation table.

A canonical factor in the band-generator presentation is also uniquely
determined by the associated permutation. Thus a canonical factor can be
represented by a permutation table as before, but a permutation is associated to
a canonical factor in the band-generator presentation only if it is a product of
"disjoint parallel descending cycles" [1]. Two descending cycles (s_i s_{i-1} ··· s_1)
and (t_j t_{j-1} ··· t_1), where s_i > ··· > s_1 and t_j > ··· > t_1, are called parallel if s_a
and s_b do not separate t_c and t_d (i.e. (s_a - t_c)(s_a - t_d)(s_b - t_c)(s_b - t_d) is positive)
for all 1 ≤ a < b ≤ i and 1 ≤ c < d ≤ j. Thus a canonical factor can also be
represented by an array X of n integers where X[i] is the maximum in the
descending cycle containing i. X is called a descending cycle decomposition table.
The permutation table is useful for products and inverses, and the descending
cycle decomposition table is useful for the meet operation discussed later. The
two tables can be converted into each other in O(n) time. Thus either of them can be chosen
to implement the braid groups without affecting the complexities of algorithms.
We describe concrete algorithms in Algorithms 1 and 2.

Algorithm 1 Convert a permutation table to a descending cycle decomposition
table.

Input: Permutation table A of length n.

Output: Descending cycle decomposition table X.

for i ← 1 to n do X[i] ← 0;

for i ← n to 1 step -1 do begin
    if (X[i] = 0) then X[i] ← i;
    if (A[i] < i) then X[A[i]] ← X[i];
end

Algorithm 2 Convert a descending cycle decomposition table to a permutation
table.

Input: Descending cycle decomposition table X.

Output: Permutation table A.

(We need an array Z of size n.)

for i ← 1 to n do Z[i] ← 0;
for i ← 1 to n do begin
    if (Z[X[i]] = 0) then A[i] ← X[i] else A[i] ← Z[X[i]];
    Z[X[i]] ← i;
end
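
The two conversions can be sketched in Python as follows. This is a minimal transcription under stated assumptions, not the authors' code: tables are 1-indexed lists whose entry 0 is unused, a descending cycle (s_k ··· s_1) is taken to map s_k to s_{k-1}, ..., s_2 to s_1 and s_1 back to s_k, and the function names are hypothetical.

```python
def perm_to_cycle_table(A):
    """Algorithm 1: permutation table -> descending cycle decomposition table.

    A is a 1-indexed list (A[0] is unused padding)."""
    n = len(A) - 1
    X = [0] * (n + 1)
    for i in range(n, 0, -1):
        if X[i] == 0:          # i is the maximum of a new cycle
            X[i] = i
        if A[i] < i:           # propagate the cycle maximum down the cycle
            X[A[i]] = X[i]
    return X


def cycle_table_to_perm(X):
    """Algorithm 2: descending cycle decomposition table -> permutation table."""
    n = len(X) - 1
    A = [0] * (n + 1)
    Z = [0] * (n + 1)          # Z[m]: last element seen in the cycle with maximum m
    for i in range(1, n + 1):
        # The smallest element of a cycle maps to the maximum; every other
        # element maps to the previous (smaller) element of its cycle.
        A[i] = X[i] if Z[X[i]] == 0 else Z[X[i]]
        Z[X[i]] = i
    return A


if __name__ == "__main__":
    # Cycles (5 2 1) and (4 3) on {1, ..., 5}.
    A = [0, 5, 1, 4, 3, 2]
    X = perm_to_cycle_table(A)
    print(X[1:], cycle_table_to_perm(X)[1:])
```

Both functions make a single pass over the table, which matches the O(n) conversion cost stated above.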



3.2 Operations 

Comparison. Two given canonical factors are identical if and only if their
representations given by either permutation tables or descending cycle decomposition
tables are identical. Thus the comparison is an O(n) operation.

Product and Inverse. The product and inverse operations in permutation
groups are done in O(n). If the product of two canonical factors is again a
canonical factor, the composition of associated permutations is the permutation
associated to the product in both presentations. Hence in this case the product
of canonical factors is computed in O(n).

The Automorphism τ. The automorphism τ defined by τ(a) = D^{-1} a D
sends canonical factors to canonical factors. An arbitrary power τ^u(a) for a
canonical factor a can also be computed in O(n), independent of u, since the
permutation table of D^u can be obtained immediately from the parity (resp. the
modulo n residue class) of u in the Artin (resp. band-generator) presentation.
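
For the Artin presentation, the parity argument can be sketched as below. The sketch assumes that the permutation of the fundamental braid is the order-reversing map i → n + 1 - i, so that τ conjugates a permutation table by that reversal; tables are 1-indexed lists with an unused entry 0, and the function name is illustrative.

```python
def tau_power_artin(A, u):
    """Apply tau^u to a canonical factor given by its permutation table A
    (1-indexed list, A[0] unused) in the Artin presentation.

    Only the parity of u matters: tau^2 is the identity on permutation tables."""
    if u % 2 == 0:
        return list(A)
    n = len(A) - 1
    T = [0] * (n + 1)
    for i in range(1, n + 1):
        # Conjugation by the reversal i -> n + 1 - i.
        T[i] = n + 1 - A[n + 1 - i]
    return T


if __name__ == "__main__":
    sigma_1 = [0, 2, 1, 3, 4]            # transposition of 1 and 2 in B_4
    print(tau_power_artin(sigma_1, 1))   # table of sigma_3
```

Because the reversal is an involution, the result does not depend on the convention chosen for composing permutations.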



Meet. In the Artin presentation, an algorithm computing the meet of two
canonical factors with O(n log n) running time and O(n) space is known [3,
Chapter 9]. We explain the idea of the algorithm briefly. Suppose that A and B
are canonical factors and let C = A ∧_L B be the left meet. We view A, B and C as
permutation tables. The algorithm sorts the integers 1, ..., n according to the
order ≺ defined by x ≺ y if and only if C[x] < C[y]. The final result is the
permutation table of the inverse of C, and by inverting it the permutation table of C
is obtained. Using the standard divide-and-conquer trick, we divide the sequence to
be sorted into two parts, say X and Y, sort each of X and Y recursively, and
merge them according to ≺. In the merging step, we need to compare integers
x ∈ X and y ∈ Y according to ≺. The essential point is that y ≺ x if and only
if the infimum of A[i] over all i ∈ X lying on the right-hand side of x is greater
than the supremum of A[j] over all j ∈ Y lying on the left-hand side of y, and
the analogous condition holds for B. This can be checked in constant time using
tables of infimums and supremums, which can be constructed before the merge
step in linear time proportional to the sum of the sizes of X and Y. Hence the
total running time is equal to that of standard divide-and-conquer sorting, O(n log n). We
describe the left meet algorithm explicitly in Algorithm 3.

Algorithm 3 Compute the meet of two canonical factors in the Artin
presentation.

Input: Permutation tables A, B.

Output: The permutation table C of the meet A ∧_L B.

(We need arrays U, V, W of size n.)

Initialize C as the identity permutation;

Sort C[1] ··· C[n] according to A and B (see the subalgorithm below);

C ← inverse permutation of C;

Subalgorithm: Sort C[s] ··· C[t] according to A and B.

if t ≤ s then return;

m ← ⌊(s + t)/2⌋;

Sort C[s] ··· C[m] according to A and B;

Sort C[m + 1] ··· C[t] according to A and B;

U[m] ← A[C[m]];

V[m] ← B[C[m]];

if s < m then
    for i ← m - 1 to s step -1 do begin
        U[i] ← min(A[C[i]], U[i + 1]);
        V[i] ← min(B[C[i]], V[i + 1]);
    end

U[m + 1] ← A[C[m + 1]];
V[m + 1] ← B[C[m + 1]];
if t > m + 1 then
    for i ← m + 2 to t do begin
        U[i] ← max(A[C[i]], U[i - 1]);
        V[i] ← max(B[C[i]], V[i - 1]);
    end
l ← s;
r ← m + 1;
for i ← s to t do begin
    if (l > m) ∨ ((r ≤ t) ∧ (U[l] > U[r]) ∧ (V[l] > V[r]))
    then W[i] ← C[r]; r ← r + 1;
    else W[i] ← C[l]; l ← l + 1;
end

for i ← s to t do C[i] ← W[i];

The right meet is computed in a similar way, or alternatively by the identity
A ∧_R B = (A^{-1} ∧_L B^{-1})^{-1}, where the inverse notations denote the inverses in
the permutation group.
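
The merge-sort formulation of the left meet can be sketched in Python as follows. This is a hedged transcription of the idea described above rather than the authors' implementation: permutation tables are 1-indexed lists with an unused entry 0, the recursion stops on ranges of length at most one, the merge takes an element from the right half only while that half is non-empty, and the name `left_meet` is illustrative.

```python
def left_meet(A, B):
    """Left meet of two Artin canonical factors given as permutation tables
    (1-indexed lists, entry 0 unused). Returns the permutation table of the meet."""
    n = len(A) - 1
    C = list(range(n + 1))        # identity permutation to be sorted in place
    U = [0] * (n + 2)             # running min/max of A along the current ranges
    V = [0] * (n + 2)             # running min/max of B along the current ranges
    W = [0] * (n + 1)             # merge buffer

    def sort(s, t):
        if t <= s:
            return
        m = (s + t) // 2
        sort(s, m)
        sort(m + 1, t)
        # Suffix minima over the left half.
        U[m], V[m] = A[C[m]], B[C[m]]
        for i in range(m - 1, s - 1, -1):
            U[i] = min(A[C[i]], U[i + 1])
            V[i] = min(B[C[i]], V[i + 1])
        # Prefix maxima over the right half.
        U[m + 1], V[m + 1] = A[C[m + 1]], B[C[m + 1]]
        for i in range(m + 2, t + 1):
            U[i] = max(A[C[i]], U[i - 1])
            V[i] = max(B[C[i]], V[i - 1])
        l, r = s, m + 1
        for i in range(s, t + 1):
            if l > m or (r <= t and U[l] > U[r] and V[l] > V[r]):
                W[i] = C[r]
                r += 1
            else:
                W[i] = C[l]
                l += 1
        C[s:t + 1] = W[s:t + 1]

    sort(1, n)
    # The sorted list is the inverse of the meet; invert it.
    meet = [0] * (n + 1)
    for pos in range(1, n + 1):
        meet[C[pos]] = pos
    return meet


if __name__ == "__main__":
    A = [0, 3, 1, 4, 2]
    D = [0, 4, 3, 2, 1]           # order-reversing permutation
    print(left_meet(A, D)[1:])
```

Simple sanity checks for such a routine are that the meet of A with itself and with the order-reversing permutation is A, and that the meet of A with the identity is the identity.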

In the band-generator presentation, it is known that the meet of two canonical
factors can be computed in O(n) time [1]. Basically, the meet is obtained by
computing the refinement of the two partitions of {1, ..., n} that correspond to
the parallel descending cycle decompositions. We describe below an algorithm
to compute the meet, which is an improved version of the one in [1]. We remark that
the left meet and the right meet are the same in the band-generator presentation.

Algorithm 4 Compute the meet of two canonical factors in the band-generator
presentation.

Input: Descending cycle decomposition tables A and B.

Output: The descending cycle decomposition table C of the meet A ∧ B.

for i ← 1 to n do U[i] ← n - i + 1;

Sort U[1] ··· U[n] such that
(A[U[i]], B[U[i]], U[i]) is ascending in the dictionary order;

j ← U[n]; C[j] ← j;

for i ← n - 1 to 1 step -1 do begin
    if (A[j] ≠ A[U[i]]) ∨ (B[j] ≠ B[U[i]]) then j ← U[i];
    C[U[i]] ← j;
end
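
The refinement idea can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' routine: tables are 1-indexed lists with an unused entry 0, the triples (A[x], B[x], x) are sorted in ascending order with Python's built-in comparison sort instead of a bucket sort, and the scan starts from the largest triple so that every block of the refined partition is labelled by its maximum. The name `band_meet` is hypothetical.

```python
def band_meet(A, B):
    """Meet of two band-generator canonical factors given as descending cycle
    decomposition tables (1-indexed lists, entry 0 unused, n >= 1).

    Two points lie in the same cycle of the meet exactly when they lie in the
    same cycle of A and in the same cycle of B."""
    n = len(A) - 1
    # Group points by (A-label, B-label); inside a group, order by the point itself.
    U = sorted(range(1, n + 1), key=lambda x: (A[x], B[x], x))
    C = [0] * (n + 1)
    j = U[-1]                     # largest point of the last group
    C[j] = j
    for x in reversed(U[:-1]):
        if A[j] != A[x] or B[j] != B[x]:
            j = x                 # x starts a new group and is its maximum
        C[x] = j
    return C


if __name__ == "__main__":
    A = [0, 5, 5, 4, 4, 5]        # blocks {1, 2, 5} and {3, 4}
    B = [0, 5, 4, 4, 4, 5]        # blocks {1, 5} and {2, 3, 4}
    print(band_meet(A, B)[1:])    # blocks {1, 5}, {2}, {3, 4}
```

Replacing the built-in sort by two stable bucket-sort passes gives the linear-time variant discussed in the text.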

The complexity is determined by the sorting step since all the other parts
are done in linear time. In braid cryptosystems, it is expected that n is not so
large (perhaps less than 500) and hence it is practically reasonable to apply the
bucket sort algorithm. The bucket sort algorithm can be applied twice to sort
the triples (A[U[i]], B[U[i]], U[i]) lexicographically. (Recall that the original order is
preserved as much as possible by the bucket sort.) Since we have at most n
possibilities for the values of A[U[i]] and B[U[i]], both space and execution time
are linear in n. In some situations, the following trade-off of space and execution
time is useful. We may sort the pairs (A[U[i]], B[U[i]]) using the bucket sort
algorithm once, where O(n^2) space is required but the practical execution speed
is improved. To save space (e.g. on small platforms), usual sorting algorithms
by comparisons (e.g. divide-and-conquer sort) can be applied to get an O(n log n)
algorithm that requires no additional space.

4 Braids 

4.1 Data Structures 

Writing a given braid as β = D^q A_1 A_2 ··· A_ℓ, where q is an integer and each A_i
is a canonical factor, we represent the braid as a pair β = (q, (A_i)) of an integer
q and a list of ℓ canonical factors (A_i) in both presentations. We note that this
representation is not necessarily the left canonical form of β, and hence ℓ may
be greater than the canonical length of β.

A braid given as a word in generators is easily converted into the above form,
in both presentations, by rewriting each negative power σ^{-1} of generators as a
product of D^{-1} and a canonical factor Dσ^{-1} and collecting every power of D
at the left end using the fact (∏ A_i) D^u = D^u (∏ τ^u(A_i)) for any sequence
of canonical factors A_i. This is done in O(nℓ), where n is the braid index and ℓ
is the length of the given word.

4.2 Operations

Group Operations. Basic group operations are easily implemented. From the
identity

    (D^p A_1 ··· A_ℓ)(D^q B_1 ··· B_{ℓ'}) = D^{p+q} τ^q(A_1) ··· τ^q(A_ℓ) B_1 ··· B_{ℓ'}    (2)

the multiplication of two braids is just the juxtaposition of the two lists of
permutations after applying τ. The inverse of a braid can be computed using the formula

    (D^q A_1 ··· A_ℓ)^{-1} = D^{-q-ℓ} τ^{-q-ℓ}(B_ℓ) ··· τ^{-q-1}(B_1)    (3)

where B_i = A_i^{-1} D, viewing A_i and D as permutations. Since a power of τ is
computed in linear time in n, braid multiplication and inversion have complexity
O(ℓn). A conjugation consists of two multiplications and one inversion, and hence
also has the complexity O(ℓn).
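
The inversion formula can be checked with a short derivation that uses only the commutation rule X D^u = D^u τ^u(X) for canonical factors X. The sketch below assumes that B_i denotes the canonical factor with A_i B_i = D in the braid group, so that A_i^{-1} = B_i D^{-1}.

```latex
\begin{align*}
(D^{q} A_1 \cdots A_\ell)^{-1}
  &= A_\ell^{-1} \cdots A_1^{-1} D^{-q} \\
  &= (B_\ell D^{-1}) (B_{\ell-1} D^{-1}) \cdots (B_1 D^{-1}) D^{-q}
     && \text{since } A_i B_i = D \\
  &= D^{-q-\ell}\, \tau^{-q-\ell}(B_\ell) \cdots \tau^{-q-1}(B_1)
     && \text{since } X D^{u} = D^{u} \tau^{u}(X).
\end{align*}
```

In the last step each B_i is moved past all powers of D standing to its right, whose exponents add up to -q-i.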



Left Canonical Form. A representation of a braid can be converted into the
left canonical form by the algorithms in [3, Chapter 9] and [1]. Given a positive
braid P = A_1 ··· A_ℓ, where each A_i is a canonical factor, the algorithm computes the
maximal heads of A_ℓ, A_{ℓ-1} A_ℓ, ..., A_1 ··· A_ℓ = P sequentially using
the following facts [3, Chapter 9] [2] [1].

1. For any positive braids A and P, the maximal head of AP is the maximal
head of the product of A and the maximal head of P.

2. For two canonical factors A and B, the maximal head of AB is A((DA^{-1}) ∧_L
B), where the inverse is taken in the permutation group.

From these facts, the i-th maximal head is the maximal head of the product
of A_{ℓ-i+1} and the (i - 1)-st maximal head, and it can be computed using the meet
operation once. At the last step, we obtain the left-weighted decomposition
P = B_1 P_1. Doing it again for P_1, we obtain the left-weighted decomposition
P_1 = B_2 P_2, and repeating this, finally we obtain the left canonical form of P.
Note that this process is very similar to the bubble sort, where the maximum
(or minimum) of the given elements is found at the first stage, and the process is
repeated for the remaining elements. The complexity of the left canonical form algorithm is the same
as that of the bubble sort: the complexities are O(ℓ^2 n log n) and O(ℓ^2 n) in the Artin
presentation and the band-generator presentation, respectively. The difference
comes from the complexity of the meet operation. We describe the left canonical
form algorithm in a concrete form.

Algorithm 5 Convert a braid into the left canonical form.

Input: A braid representation β = (p, (A_i)).

Output: The left canonical form of β.

ℓ ← ℓ(β);
i ← 1;
while (i < ℓ) do begin
    t ← ℓ;
    for j ← ℓ - 1 to i step -1 do begin
        B ← (D A_j^{-1}) ∧_L A_{j+1};
        if (B is nontrivial) then begin
            t ← j; A_j ← A_j B; A_{j+1} ← B^{-1} A_{j+1};
        end
    end
    i ← t + 1;
end

while (ℓ > 0) ∧ (A_1 = D) do begin
    Remove A_1 from β; ℓ ← ℓ - 1; p ← p + 1;
end

while (ℓ > 0) ∧ (A_ℓ is trivial) do begin
    Remove A_ℓ from β; ℓ ← ℓ - 1;
end

The multiplications and inversions in lines 6 and 8 are performed viewing D,
B and A_j as permutations.

We remark that Algorithm 5 can be modified for parallel processing. For
convenience, we denote the job of lines 6-9 for (i, j) by S(i, j). Then S(i, j)
can be processed after S(i - 1, j - 1) is finished. Thus the jobs S(1, k), S(2, k +
2), ..., S(ℓ - 1, k + 2(ℓ - 2)) can be processed simultaneously for k = ℓ - 1, ℓ -
2, ..., 1, 0, -1, ..., -ℓ + 3. (S(i, j) for invalid (i, j) is ignored here.) This method
offers algorithms with O(ℓ n log n) and O(ℓn) execution time in the Artin and
the band-generator presentation, using O(ℓ) processors.

Comparison. In order to compare two braids β_1 and β_2 with ℓ_1 and ℓ_2
canonical factors, we need to convert them into their canonical forms since the same
braid can be represented in different forms. Assuming β_1 and β_2 are in left
canonical form, the comparison is done by comparing the exponents of D and
the lists of canonical factors, and so has complexity O(min{ℓ_1, ℓ_2} · n). Without
the assumption, the total complexity of comparison is equal to that of the
conversion into left canonical form, O(min{ℓ_1, ℓ_2} · n log n) and O(min{ℓ_1, ℓ_2} · n)
for the Artin presentation and band-generator presentation, respectively. (Note
that for comparison, Algorithm 5 can be executed simultaneously for β_1 and β_2
to extract the canonical factors in the left canonical forms, and stopped if either
different canonical factors are found or nothing is left for one of β_1 and β_2.)

5 Random Braids 

Random braids play an important role in braid cryptosystems [5,7]. Since the
braid group B_n is discrete and infinite, a probability distribution on B_n makes no
sense. But since there are finitely many positive n-braids with ℓ canonical factors, we
may consider randomness for these braids. Since such a braid can be generated
by concatenating ℓ random canonical factors, the problem is reduced to how to
choose a random canonical factor in both presentations.

5.1 Artin Presentation

In the Artin presentation of B_n, a canonical factor can be chosen randomly by
generating a random n-permutation. It is well known that this is done by using
a random number oracle (n - 1) times; we start with the identity permutation
table A, and for i = 1, 2, ..., n - 1, pick a random number j between i and n
and swap A[i] and A[j].
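
The shuffle described above can be sketched in a few lines of Python. The sketch assumes a 1-indexed table with an unused entry 0 and takes the random number oracle as a parameter; the default `random` module is not a cryptographic generator, so a caller would pass a stronger source such as `secrets.SystemRandom()`. The function name is illustrative.

```python
import random


def random_artin_canonical_factor(n, rng=random):
    """Return a uniformly random permutation table of {1, ..., n}
    (1-indexed list, entry 0 unused) using n - 1 oracle calls."""
    A = list(range(n + 1))            # identity permutation table
    for i in range(1, n):
        j = rng.randint(i, n)         # random j with i <= j <= n
        A[i], A[j] = A[j], A[i]
    return A


if __name__ == "__main__":
    print(random_artin_canonical_factor(8)[1:])
```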



5.2 Band-Generator Presentation

In the band-generator presentation, we need more complicated arguments.
Parallel descending cycle decompositions can be identified with non-crossing
partitions of the set {1, ..., n}. It is known that they are again naturally bijective
to the set BS_n of ballot sequences s_1 s_2 ··· s_{2n} of length 2n, which are defined
to be sequences of ±1 satisfying s_1 + ··· + s_k ≥ 0 for all k and s_1 + ··· + s_{2n} = 0
(e.g. see [8]). Of course, |BS_n| is equal to the n-th Catalan number C_n. The
recurrence relation

    C_n = C_0 C_{n-1} + C_1 C_{n-2} + ··· + C_{n-1} C_0    (4)

can be naturally interpreted by means of ballot sequences as follows. For a given
ballot sequence s_1 ··· s_{2n}, choose the minimal i such that s_1 + ··· + s_{2i} = 0. Then
s_1 = 1, s_{2i} = -1 and the subsequences s_2 ··· s_{2i-1} and s_{2i+1} ··· s_{2n} are again ballot
sequences of length 2(i - 1) and 2(n - i), respectively. This establishes a bijection
between BS_n and the disjoint union of BS_{i-1} × BS_{n-i} over i = 1, ..., n. We inductively define
a linear order on BS_n via the bijection, by the following rules: elements in
BS_{i-1} × BS_{n-i} are smaller than elements in BS_{j-1} × BS_{n-j} if and only if i < j,
and elements in BS_{i-1} × BS_{n-i} are lexicographically ordered. Then a random
ballot sequence can be generated as follows. Choose a random number k between
1 and C_n, and take the k-th ballot sequence. Algorithm 6 does the second step,
by tracing the above bijection recursively. By induction, it can be shown that
the running time of Algorithm 6 is O(n log n).

Algorithm 6 Construct the k-th ballot sequence of length 2n.

Input: An integer k between 1 and C_n.
Output: The k-th ballot sequence s_1 ··· s_{2n}.

if k ≤ C_0 C_{n-1} then i ← 1;
elseif k > C_n - C_{n-1} C_0 then begin i ← n; k ← k - C_n + C_{n-1} C_0; end
else for i ← 1 to n do
    if (k ≤ C_{i-1} C_{n-i}) then break;
    else k ← k - C_{i-1} C_{n-i};

x ← ⌊(k - 1)/C_{n-i}⌋; y ← (k - 1) - x C_{n-i};

s_1 ← 1; s_{2i} ← -1;

if i > 1 then s_2 ··· s_{2i-1} ← the (x + 1)-st ballot sequence of length 2(i - 1);
if i < n then s_{2i+1} ··· s_{2n} ← the (y + 1)-st ballot sequence of length 2(n - i);
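
The unranking step can be sketched recursively in Python. This is a minimal sketch under stated assumptions rather than the authors' code: ranks are 1-based, the pair (x, y) is obtained from k - 1 so that both components stay in range, the first return to zero of the partial sums is placed at position 2i, and the Catalan table is built with the recurrence C_{m+1} = (4m + 2) C_m / (m + 2). All function names are illustrative.

```python
def catalan_table(n):
    """Return [C_0, ..., C_n] using C_{m+1} = (4m + 2) C_m / (m + 2)."""
    C = [1] * (n + 1)
    for m in range(n):
        C[m + 1] = (4 * m + 2) * C[m] // (m + 2)   # the division is exact
    return C


def kth_ballot(n, k, C):
    """Return the k-th (1 <= k <= C[n]) ballot sequence of length 2n
    as a list of +1 / -1 values."""
    if n == 0:
        return []
    # Find the block BS_{i-1} x BS_{n-i} that contains rank k.
    i = 1
    while k > C[i - 1] * C[n - i]:
        k -= C[i - 1] * C[n - i]
        i += 1
    # Lexicographic rank inside the block: x indexes BS_{i-1}, y indexes BS_{n-i}.
    x, y = divmod(k - 1, C[n - i])
    inner = kth_ballot(i - 1, x + 1, C)
    tail = kth_ballot(n - i, y + 1, C)
    return [1] + inner + [-1] + tail


if __name__ == "__main__":
    C = catalan_table(3)
    for k in range(1, C[3] + 1):
        print(k, kth_ballot(3, k, C))
```

Drawing k uniformly from 1, ..., C_n and calling `kth_ballot` then gives a uniformly random ballot sequence, since the map from ranks to sequences is a bijection.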

A ballot sequence can be transformed to a permutation table associated to a
canonical factor in the band-generator presentation, via the correspondence
between ballot sequences and non-crossing partitions of {1, ..., n} [8]. We describe
an O(n) algorithm.

Algorithm 7 Convert a ballot sequence to a disjoint cycle decomposition table.

Input: A ballot sequence s_1 ⋯ s_{2n}.
Output: A permutation table A.

(We need a stack S of maximal size n.)

for i ← 1 to 2n do begin
    if s_i = 1 then push i into S;
    else begin
        Pop j from S;
        if i is odd then A[(i + 1)/2] = j/2
        else A[j/2] = (i + 1)/2;
    end
end

In the above discussion, we assume that the Catalan numbers $C_n$ are known.
This is not a severe problem, since a table of $C_n$ can be computed very quickly
using the recurrence relation $C_{n+1} = (4n + 2) C_n / (n + 2)$. If you want to avoid
division of big integers, the recurrence relation (4) is useful.
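Both ways of filling the table can be sketched in a few lines of Python. This is only an illustration with hypothetical function names; the first function uses the ratio recurrence (the division is always exact), the second uses recurrence (4) and needs no division at all.

```python
def catalan_by_ratio(n_max):
    """Table C_0..C_{n_max} using C_{n+1} = (4n + 2) C_n / (n + 2)."""
    table = [1]
    for n in range(n_max):
        # (4n + 2) * C_n is always divisible by n + 2, so // is exact.
        table.append((4 * n + 2) * table[n] // (n + 2))
    return table


def catalan_by_convolution(n_max):
    """Table C_0..C_{n_max} using recurrence (4), with no division."""
    table = [1]
    for n in range(1, n_max + 1):
        table.append(sum(table[j] * table[n - 1 - j] for j in range(n)))
    return table
```

The ratio version costs one multiplication and one division of big integers per entry, while the convolution version costs a linear number of multiplications per entry.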

We finish this section with a remark on the distribution generated by our
algorithm. Since the same braid can be represented in different ways in our
implementation, the distribution is not uniform on the set of positive $n$-braids of
canonical length $\ell$. However, the distribution has the property that more complex
braids, which can be represented in more different ways, are generated with
higher probability. This seems to be a nice property for braid cryptosystems.

6 Performance 

In this section we consider the braid cryptosystem proposed in [7], which is a
revised version of the one in [5]. Let $LB_n$ and $UB_n$ be the subgroups of $B_n$ generated
by the Artin generators $\sigma_1, \ldots, \sigma_{\lfloor n/2 \rfloor - 1}$ and $\sigma_{\lfloor n/2 \rfloor + 1}, \ldots, \sigma_{n-1}$, respectively. A secret
key is given as a pair $(a_1, a_2)$, where $a_1$ and $a_2$ are in $LB_n$, and the associated
public key is a pair $(x, y)$ such that $y = a_1 x a_2$. The encryption and decryption
scheme is as follows.

Encryption Given a message $m \in \{0, 1\}^k$,

1. Choose $b_1, b_2 \in UB_n$.

2. Ciphertext is $(c_1, c_2) = (b_1 x b_2, H(b_1 y b_2) \oplus m)$.

Decryption Given a ciphertext $(c_1, c_2)$, $m = H(a_1 c_1 a_2) \oplus c_2$.

In the above scheme, $H : B_n \to \{0, 1\}^k$ is a collision-free hash function. $H$
can be obtained by composing a collision-free hash function of bitstrings into
$\{0, 1\}^k$ with a conversion function of braids into bitstrings. A braid given as its
left canonical form $D^u A_1 \cdots A_\ell$ can be converted into a bitstring by dumping
the integer $u$ and the permutation tables of $A_i$ as binary digits for $i = 1, \ldots, \ell$
sequentially. Since different braids are converted into different bitstrings, this
conversion can be used as a part of the hash $H$.

We remark that if the secret key is of the form $(a, a^{-1})$ and $b_1^{-1}$ is taken as
$b_2$ in the above encryption procedure, the cryptosystem in [5] is obtained. Hence,
as far as performance is concerned, there is no difference between the cryptosystems
in [7] and [5].

The above scheme is easily implemented based on our work. Encryption involves
two random braid generations, four multiplications and two left canonical
form operations. Decryption involves two multiplications and one
left canonical form operation. Thus both operations have running
time $O(\ell^2 n \log n)$ and $O(\ell^2 n)$ in the Artin and the band-generator presentation,
respectively. In Table 1, we show the performance of an implementation of the
cryptosystem using the Artin presentation, at various security parameters
suggested in [5]. The security levels are estimated using the results of [7]. In order
to focus on the performance of braid operations, the execution time of the hash
function is ignored. This experiment was performed on a computer with a Pentium
III 866MHz processor.



Table 1. Performance of the braid cryptosystem at various parameters

  n     $\ell$   Block Size   Encryption Speed            Decryption Speed            Security
                 (Kbyte)      (Block/sec)  (Kbyte/sec)    (Block/sec)  (Kbyte/sec)    Level
  100   15       1.97         74.46        146.53         95.60        188.13         $2^{85}$
  150   20       4.36         37.44        163.40         47.42        206.94         $2^{125}$
  200   30       9.34         17.21        160.71         22.30        208.26         $2^{199}$
  250   40       16.36        10.61        173.66         13.62        222.78         $2^{280}$



7 Conclusion 

Table 2 summarizes the braid algorithms discussed and their complexities. In the Input
and Output columns, PT, DT, AB and BB mean a permutation table, a descending
cycle decomposition table, a braid given by the Artin presentation and
a braid given by the band-generator presentation, respectively. As usual $n$ is the
braid index and $\ell$ the maximum of canonical lengths (or numbers of canonical
factors) of input braids, except for the comparison algorithm, where $\ell$ denotes
the minimum of canonical lengths of two given braids. The complexities of the
algorithms are measured by the number of steps required. The space complexities
of the algorithms are easily seen to be either constant or linear.
Table 2. Complexities of braid algorithms

  Operation             Input         Output        Complexity               Reference
  PT → DT               PT            DT            $O(n)$                   Alg. 1
  DT → PT               DT            PT            $O(n)$                   Alg. 2
  Product               PT            PT            $O(n)$                   3.2
  Inverse               PT            PT            $O(n)$                   3.2
  [?]                   PT            PT            $O(n)$                   3.2
  Meet (Artin)          PT            PT            $O(n \log n)$            Alg. 3
  Meet (Band)           DT            DT            $O(n)$                   Alg. 4
  Comparison            PT (or DT)    True/False    $O(n)$                   3.2
  Random (Artin)        -             PT            $O(n)$                   5.1
  Random (Band)         -             PT            $O(n \log n)$            5.2, Alg. 6, 7
  Product               AB (or BB)    AB (or BB)    $O(\ell n)$              4.2
  Inverse               AB (or BB)    AB (or BB)    $O(\ell n)$              4.2
  Left Canonical Form   AB            AB            $O(\ell^2 n \log n)$     Alg. 5
                        BB            BB            $O(\ell^2 n)$            Alg. 5
  Comparison            AB            True/False    $O(\ell^2 n \log n)$     4.2
                        BB            True/False    $O(\ell^2 n)$            4.2
  Random                -             AB            $O(\ell n)$              5
                        -             BB            $O(\ell n \log n)$       5
Abstract. McEliece is one of the oldest known public key cryptosystems.
Though it has been less widely studied than RSA, it is remarkable
that all known attacks are still exponential. It is widely believed that
code-based cryptosystems like McEliece do not allow practical digital
signatures. In the present paper we disprove this belief and show a
way to build a practical signature scheme based on coding theory. Its
security can be reduced in the random oracle model to the well-known
syndrome decoding problem and the distinguishability of permuted
binary Goppa codes from a random code. For example we propose a
scheme with signatures of 81 bits and a binary security workfactor of $2^{83}$.

Keywords: digital signature, McEliece cryptosystem, Niederreiter cryp- 
tosystem, Goppa codes, syndrome decoding, short signatures. 



1 Introduction 

The RSA and the McEliece [11] public key cryptosystems were proposed
back in the 70s. They are based on the intractability of factorization and of
the syndrome decoding problem respectively, and both have successfully resisted
more than 20 years of cryptanalysis effort.

RSA became the most widely used public key cryptosystem, while McEliece
was not quite as successful, partly because it has a large public key, which is
less of a problem today, with huge memory capacities available at very low prices.
However the main handicap was the belief that McEliece could not be used for
signatures. In the present paper we show that it is indeed possible to construct
a signature scheme based on Niederreiter's variant [12] of the McEliece
cryptosystem.

The cracking problem of RSA is the problem of extracting e-th roots modulo 
N called the RSA problem. All the general purpose attacks for it are structural 

C. Boyd (Ed.): ASIACRYPT 2001, LNCS 2248, pp. 157-174, 2001. 

(c) Springer- Verlag Berlin Heidelberg 2001 
attacks that factor the modulus N. It is a hard problem but sub-exponential.
The cracking problem for McEliece is the problem of decoding an error correcting
code, called Syndrome Decoding (SD). There are no efficient structural attacks
that might distinguish between a permuted Goppa code used by McEliece and
a random code. The problem SD has been known to be NP-hard since the seminal
paper of Berlekamp, McEliece and van Tilborg [3], in which the authors show that
complete decoding of a random code is NP-hard.

All of the several known attacks for SD are fully exponential (though faster
than exhaustive search [4]), and nobody has ever proposed an algorithm that
behaves differently for complete decoding and for the bounded decoding problem
within a (slightly smaller) distance accessible to the owner of the trapdoor. In
[6] Kobara and Imai review the overall security of McEliece and claim that 

[. . . ] without any decryption oracles and any partial knowledge on the 
corresponding plaintext of the challenge ciphertext, no polynomial-time 
algorithm is known for inverting the McEliece PKC whose parameters 
are carefully chosen. 

Thus it would be very interesting to have signature schemes based on
such hard decoding problems. The only solution available to date was to use
zero-knowledge schemes based on codes such as the SD scheme by Stern [19]. It
gives excellent security but the signatures are very long. All attempts to build
practical schemes have failed, see for example [20].

Any trapdoor function allows digital signatures by using the unique capacity 
of the owner of the public key to invert the function. However it can only be used 
to sign messages the hash value of which lies in the ciphertext space. Therefore 
a signature scheme based on trapdoor codes must achieve complete decoding. 
In the present paper we show how to achieve complete decoding of Goppa codes 
for some parameter choices. 

The paper is organized as follows. First we explain in §2 and §3 how and 
for which parameters to achieve complete decoding of Goppa codes. In §4 we 
present a practical and secure signature scheme we derive from this technique. 
Implementation issues are discussed in §5, and in particular, we present several 
tradeoffs to achieve either extremely short signatures (81 bits) or extremely fast 
verification. In §6 we present an asymptotic analysis of all the parameters of 
the system, proving that it will remain practical and secure with the evolution 
of computers. Finally in §7 we prove that the security of the system relies on 
the syndrome decoding problem and the distinguishability of Goppa codes from 
random codes. 

2 Signature with McEliece 

The McEliece cryptographic scheme is based on error correcting codes. It consists
in randomly adding errors to a codeword (as would happen in a noisy channel)
and using the result as a ciphertext. Decryption is done exactly as it would be done
to correct natural transmission errors. The security of this scheme simply relies
on the difficulty of decoding a word without any knowledge of the structure of
the code. Only the legitimate user can decode easily using the trap. The Niederreiter
variant, equivalent from a security point of view [8], uses a syndrome (see below)
as ciphertext, and the message is an error pattern instead of a codeword (see
Table 1).

2.1 A Brief Description of McEliece’s and Niederreiter’s Schemes 

Let $\mathbf{F}_2$ be the field with two elements $\{0, 1\}$. In the present paper, $C$ will
systematically denote a binary linear code of length $n$ and dimension $k$, that is
a subspace of dimension $k$ of the vector space $\mathbf{F}_2^n$. Elements of $\mathbf{F}_2^n$ are called
words, and elements of $C$ are codewords. A code is usually given in the form
of a generating matrix $G$, the rows of which form a basis of the code. The parity
check matrix $H$ is a dual form of this generating matrix: it is the $(n - k) \times n$
matrix of a linear map whose kernel is $C$. When you multiply a word (a codeword
with an error for example) by the parity check matrix you obtain what is called
a syndrome: it has a length of $n - k$ bits and is characteristic of the error added
to the codeword. It is the sum of the columns of $H$ corresponding to the
non-zero coordinates of the error pattern. Having a zero syndrome characterizes the
codewords, and we have $G H^T = 0$.
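A small Python sketch may help make the syndrome concrete. It uses the parity check matrix of the [7, 4] Hamming code as a toy stand-in (an assumption chosen for readability, not a code from the paper) and checks that the syndrome of a corrupted codeword equals the column of $H$ at the error position.

```python
# Parity check matrix of the [7, 4] Hamming code: column j (1-based)
# is the binary expansion of j, least significant bit in the first row.
H = [
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]


def syndrome(H, word):
    """Syndrome H * word^T over F_2, as a list of n - k bits."""
    return [sum(h * w for h, w in zip(row, word)) % 2 for row in H]


codeword = [1, 1, 1, 0, 0, 0, 0]   # columns 1, 2 and 3 of H sum to zero
received = list(codeword)
received[4] ^= 1                   # single error at the fifth coordinate

zero_syndrome = syndrome(H, codeword)
error_syndrome = syndrome(H, received)
fifth_column = [row[4] for row in H]
```

Here `zero_syndrome` is all zeros because `codeword` belongs to the code, and `error_syndrome` coincides with `fifth_column`, i.e. the sum of the columns of `H` indexed by the error pattern.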

Let $C$ be a binary linear code of length $n$ and dimension $k$ correcting $t$ errors
(i.e. minimum distance is at least $2t + 1$). Let $G$ and $H$ denote respectively a
generator and a parity check matrix of $C$. Table 1 briefly describes the two main
encryption schemes based on codes. In both cases the trap is a $t$-error correcting
procedure for $C$. It enables decryption (i.e. finding the closest codeword
to a given word or equivalently the word of smallest Hamming weight with a
prescribed syndrome).

Table 1. McEliece and Niederreiter code-based cryptosystems

                      McEliece                            Niederreiter
  public key:         $G$                                 $H$
  cleartext:          $x \in \mathbf{F}_2^k$              $x \in \mathbf{F}_2^n$, $w_H(x) = t$
  ciphertext:         $y = xG + e$, $w_H(e) = t$          $y = Hx^T$
  ciphertext space:   $\mathbf{F}_2^n$                    $\mathbf{F}_2^{n-k}$

The secret key is a code $C_0$ (usually a Goppa code) whose algebraic structure
provides a fast decoder. The public code is obtained by randomly permuting the
coordinates of $C_0$ and then choosing a random generator or parity check matrix:

$$G = U G_0 P \quad \text{or} \quad H = V H_0 P$$

where $G_0$ and $H_0$ are a generator and a parity check matrix of $C_0$, $U$ and $V$ are
non-singular matrices ($k \times k$ and $(n - k) \times (n - k)$ respectively) and $P$ is an $n \times n$
permutation matrix.
The security of these two systems is proven to be equivalent [8] and is based
on two assumptions:

- solving an instance of the decoding problem is difficult,
- recovering the underlying structure of the code is difficult.

The first assumption is supported by complexity theory results [3,2,16], and by
extensive research on general purpose decoders [7,18,4]. The second assumption
has received less attention. Still, the Goppa codes used in McEliece have been known
to coding theorists for thirty years and so far no polynomially computable property
is known to distinguish a permuted Goppa code from a random linear code.

2.2 How to Make a Signature 

In order to obtain an efficient digital signature we need two things: an algorithm
able to compute, for any document, a signature that identifies its author
uniquely, and a fast verification algorithm available to everyone.

A public-key encryption function can be used as a signature scheme as follows:

1. hash (with a public hash algorithm) the document to be signed,
2. decrypt this hash value as if it were an instance of ciphertext,
3. append the decrypted message to the document as a signature.

Verification just applies the public encryption function to the signature and
verifies that the result is indeed the hash value of the document. In the case
of Niederreiter or any other cryptosystem based on error correcting codes,
step 2 fails. The reason is that a random syndrome usually
corresponds to an error pattern of weight greater than $t$. In other words, it is
difficult to generate a random ciphertext unless it is explicitly produced as an
output of the encryption algorithm.

One solution to the problem is to obtain for our code an algorithm to decode
any syndrome, or at least a good proportion of them. This is the object of the next
section.



2.3 Complete Decoding 

Complete decoding consists of finding a nearest codeword to any given word of
the space. In syndrome language, this means being able to find an error pattern
corresponding to any given syndrome, and thus decoding syndromes
corresponding to errors of weight greater than $t$.

An approach to complete decoding would be to try to correct
a fixed additional number of errors (say $\delta$). To decode a syndrome corresponding
to an error of weight $t + \delta$ one should then add $\delta$ random columns from the
parity check matrix to the syndrome and try to decode it. If all of the $\delta$ columns
correspond to some error positions then the new syndrome obtained will
correspond to a word of weight $t$ and can be decoded by our trapdoor function.
Otherwise we will just have to try again with $\delta$ other columns, and so on until we can
decode one syndrome. In this way we can decode any syndrome corresponding to
an error of weight less than or equal to $t + \delta$. If $\delta$ is large enough we should be
able to decode any possible syndrome. However, a large $\delta$ will lead to a small
probability of success for each choice of $\delta$ columns. This means that we will have
to adapt the parameters of our code to obtain a $\delta$ small enough and at the same
time keep a good security for our system.

Fig. 1. From bounded decoding to complete decoding

This can be viewed from a different angle. Adding a random column of the
parity check matrix to a syndrome really looks like choosing another random
syndrome and trying to decode it. Choosing parameters for the code such that
$\delta$ is small enough simply consists of increasing the density of the decodable
syndromes in the space of all the syndromes, that is, increasing the probability
for a random syndrome to be decodable. This method will therefore take a first
random syndrome (given by the hash function) and try to decode it, then modify
the document and hash it again until a decodable syndrome is obtained.

The object of the next section will be to choose parameters such that the 
number of necessary attempts is small enough for this method to work in a 
reasonable time. 



3 Finding the Proper Parameters 

The parameters of the problem are the dimension k of the code, its length n and 
the maximum number t of errors the code can correct. These parameters affect 
all aspects of the signature scheme: its security, the algorithmic complexity for 
computing a signature, the length of the signature... We will start by explor- 
ing the reasons why the classical McEliece parameters are not acceptable and 
continue with what we wish to obtain. 



3.1 Need for New Parameters 

With the classical McEliece parameters ($n = 1024$, $k = 524$, $t = 50$) we have
syndromes of length $n - k = 500$. This makes a total of $2^{500}$ syndromes. Among




162 



N.T. Courtois, M. Finiasz, and N. Sendrier 



these only those corresponding to words of weight at most 50 are decodable.
The number of such syndromes is:

$$\sum_{i=0}^{50} \binom{1024}{i} \simeq 2^{284}$$

Therefore there is only a probability of $2^{-216}$ of success for each syndrome.
This would mean an average number of decoding attempts of $2^{216}$ which is far
too many. We will hence have to change the values of $n$, $k$ and $t$.
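The order of magnitude can be checked directly with exact integer arithmetic. The following Python sketch is only a verification aid (the function name is hypothetical): it counts the error patterns of weight at most $t$ and compares that count with the $2^{n-k}$ possible syndromes.

```python
from math import comb, log2


def log2_success_probability(n, k, t):
    """log2 of (number of decodable syndromes) / (number of syndromes)."""
    decodable = sum(comb(n, i) for i in range(t + 1))
    return log2(decodable) - (n - k)


classical = log2_success_probability(1024, 524, 50)
```

For the classical parameters, `classical` lies close to -216, so a random syndrome is decodable with probability about $2^{-216}$.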

3.2 Choosing Parameters 

Binary Goppa codes are subfield subcodes of particular alternant codes [10,
Ch. 12]. For a given integer $m$, there are many (about $2^{tm}/t$) $t$-error correcting
Goppa codes of dimension $n - tm$ and length $n = 2^m$.

We are looking for parameters which lead to a good probability of success for
each random syndrome. The probability of success will be the ratio between the
number of decodable syndromes $N_{dec}$ and the total number of syndromes $N_{tot}$.
As $n$ is large compared with $t$ we have:

$$N_{dec} = \sum_{i=0}^{t} \binom{n}{i} \simeq \binom{n}{t} \simeq \frac{n^t}{t!}$$

and for Goppa codes $N_{tot} = 2^{n-k} = 2^{mt} = n^t$. Therefore the probability of
success is:

$$P = \frac{N_{dec}}{N_{tot}} \simeq \frac{1}{t!}$$

This probability does not depend on $n$, and the decoding algorithm has a
polynomial complexity in $m$ ($= \log_2 n$) and $t$. Therefore the signature time will not
change a lot with $n$. As the security of the Goppa code used increases rapidly
with $n$ we can then be sure to find suitable parameters, both for the signature
time and the security.
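How close the ratio is to $1/t!$ can be checked numerically. The sketch below is an illustration only (the function name is hypothetical); it evaluates the exact ratio for a Goppa code of length $2^m$ and dimension $n - tm$.

```python
from math import comb, factorial


def success_ratio(m, t):
    """Exact N_dec / N_tot for length n = 2^m and n - k = t * m."""
    n = 2 ** m
    n_dec = sum(comb(n, i) for i in range(t + 1))
    n_tot = n ** t            # 2^(n - k) = 2^(m t) = n^t
    return n_dec / n_tot


ratio = success_ratio(16, 9)
expected_attempts = 1 / ratio
```

For $m = 16$ and $t = 9$ the value `ratio * factorial(9)` is within a fraction of a percent of 1, so about $9! = 362880$ decoding attempts are expected per signature.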

3.3 Secure Parameters 

A fast bounded decoding algorithm can perform about one million decodings in a
few minutes¹. From the previous section, the number of decoding attempts to get
one signature will be around $t!$, so to get a reasonable signature scheme, $t$ should
not be more than 10. However for codes correcting such a small number of
errors we need to have very long codewords in order to achieve good security.

Table 2 shows the binary workfactors for the Canteaut-Chabaud attack
[4] on the McEliece cryptosystem (see section 6 for more details on the complexity
of these attacks). We assume that an acceptable security level is $2^{80}$ CPU
operations, corresponding roughly to a binary workfactor of $2^{86}$. Therefore, in
our signature scheme, we need a length of at least $2^{15}$ with 10 errors or $2^{16}$ with
9 errors.

Though it is slightly below our security requirement, the choice $(2^{16}, 9)$ is
better as it runs about 10 times faster.

¹ Our implementation performs one million decodings in 5 minutes, but it can still be
improved.
Table 2. Cost for decoding

  n        $2^{11}$     $2^{12}$     $2^{13}$     $2^{14}$     $2^{15}$     $2^{16}$     $2^{17}$
  t = 8    [?]          $2^{56.6}$   [?]          $2^{65.3}$   [?]          $2^{70.5}$   [?]
  t = 9    $2^{54.6}$   $2^{59.9}$   $2^{69.3}$   $2^{74.0}$   $2^{78.8}$   $2^{83.7}$   $2^{88.2}$
  t = 10   $2^{60.9}$   $2^{66.8}$   $2^{72.3}$   $2^{77.4}$   $2^{87.4}$   $2^{90.9}$   $2^{94.6}$

4 The Signature Scheme 

With the chosen parameters we have a probability of $1/9!$ to decode each
syndrome. We will therefore have to try to decode about $9!$ random syndromes. To
do so we will simply use a counter $i$ and hash it with the document: the hashed
syndrome obtained will then depend on $i$, and by changing $i$ we can have as many
syndromes as we need. The signature scheme works as follows.

Let $h$ be a hash function returning a binary word of length $n - k$ (the length
of a syndrome). Let $D$ be our document and $s = h(D)$. We denote by $[\cdots s \cdots | \cdots i \cdots]$
the concatenation of $s$ and $i$, and $s_i = h([\cdots s \cdots | \cdots i \cdots])$.

The signature algorithm will compute the $s_i$ for $i$ starting at 0 and increasing
by 1 at each try, until one of the syndromes $s_i$ is decodable. We will denote by $i_0$ the
first index for which $s_i$ is decodable, and we will use this syndrome for the
signature. As explained in section 2.2 the signature will then be the decrypted
message, which is in our case the word $z$ of length $n$ and weight 9, such that
$Hz^T = s_{i_0}$. However the signature will also have to include the value of $i_0$ for
the verification. The signature will therefore be $[\cdots z \cdots | \cdots i_0 \cdots]$.

Signature length: the length of the signature will mainly depend on the way
used to store $z$. It is a word of length $n = 2^{16}$ so the dumb method would be
to use it directly to sign. However its weight is only 9 so we should be able to
compress it a little. There are $\binom{2^{16}}{9} \simeq 2^{125.5}$ words of weight 9 so they could be
indexed with a 126 bit counter. Let $i_1 < \ldots < i_9$ denote the positions of the
non-zero bits of $z$. We define the index $I_z$ of $z$ by:

$$I_z = 1 + \binom{i_1}{1} + \binom{i_2}{2} + \cdots + \binom{i_9}{9}$$

The number of bits used to store $i_0$ is not reducible: on average its length is
$\log_2(9!) \simeq 18.4$ bits. So the signature will be $[\cdots I_z \cdots | \cdots i_0 \cdots]$ with an average
total length of $125.5 + 18.4 \simeq 144$ bits.

Note that using the McEliece encryption scheme instead of Niederreiter's would
not be satisfactory here. The signature would have a size larger than $k$ bits (the
size of a plaintext), and it would grow rapidly with $m$ if $t$ is small. With the
parameters above, the signature would have a length of 65411 bits!
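Indexing words of fixed weight by a sum of binomial coefficients is the classical combinatorial number system. The Python sketch below illustrates it under two assumptions that the text does not state: positions are numbered from 0, and the inverse map uses a simple greedy search (a binary search would be preferable for $n = 2^{16}$). The function names are hypothetical.

```python
from math import comb, log2


def index_of(positions):
    """Index 1 + C(i_1, 1) + ... + C(i_t, t) for positions i_1 < ... < i_t."""
    ordered = sorted(positions)
    return 1 + sum(comb(p, j) for j, p in enumerate(ordered, start=1))


def positions_of(index, t):
    """Inverse of index_of: recover the t positions, largest first."""
    r = index - 1
    found = []
    for j in range(t, 0, -1):
        p = j - 1
        # Greedily pick the largest p with C(p, j) <= r.
        while comb(p + 1, j) <= r:
            p += 1
        found.append(p)
        r -= comb(p, j)
    return found[::-1]


index_bits = log2(comb(2 ** 16, 9))   # about 125.5
```

With 0-based positions below $n$, the index ranges over $1, \ldots, \binom{n}{t}$ without gaps, which is why a 126 bit counter suffices for $n = 2^{16}$ and $t = 9$.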
Signature algorithm

- hash the document $D$ into $s = h(D)$
- compute $s_i = h([\cdots s \cdots | \cdots i \cdots])$ for $i = 0, 1, 2, \ldots$
- find $i_0$, the smallest value of $i$ such that $s_i$ is decodable
- use our trapdoor function to compute $z$ such that $Hz^T = s_{i_0}$
- compute the index $I_z$ of $z$ in the space of words of weight 9
- use $[\cdots I_z \cdots | \cdots i_0 \cdots]$ as a signature for $D$

Verification algorithm is much simpler (and faster)

- recover $z$ from its index $I_z$
- compute $s_1 = Hz^T$ with the public key $H$
- compute $s_2 = h([\cdots h(D) \cdots | \cdots i_0 \cdots])$ with the public hash function
- compare $s_1$ and $s_2$: if they are equal the signature is valid
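The counter loop of the two algorithms can be sketched as follows. This is a structural illustration only: SHAKE-256 and the 4-byte counter encoding are stand-ins chosen here, and `decode` and `syndrome_of` are hypothetical callbacks standing for the trapdoor decoder and for the public computation $z \mapsto Hz^T$; index compression of $z$ is left out.

```python
import hashlib

SYNDROME_BITS = 144   # n - k = t * m for m = 16 and t = 9


def h(data):
    """Hash bytes to a SYNDROME_BITS-bit integer (stand-in hash function)."""
    digest = hashlib.shake_256(data).digest(SYNDROME_BITS // 8)
    return int.from_bytes(digest, "big")


def counter_hash(document, i):
    """s_i = h([s | i]) with s = h(document)."""
    s = h(document).to_bytes(SYNDROME_BITS // 8, "big")
    return h(s + i.to_bytes(4, "big"))


def sign(document, decode, max_tries=1 << 24):
    """Try counters 0, 1, 2, ... until decode() accepts a syndrome."""
    for i in range(max_tries):
        z = decode(counter_hash(document, i))   # None if not decodable
        if z is not None:
            return z, i
    raise RuntimeError("no decodable syndrome found")


def verify(document, signature, syndrome_of):
    """Recompute both syndromes and compare them."""
    z, i0 = signature
    return syndrome_of(z) == counter_hash(document, i0)
```

With a real decoder, `decode` succeeds for roughly one syndrome in $9!$, so the loop is expected to run about 362880 times, while `verify` needs a single hash and a single syndrome computation.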



4.1 Attacks on the Signature Length 

Having such short signatures enables attacks independent of the strength of the
trapdoor function used, which are inherent to the commonly used method of
computing a signature by inversion of the function. This generic attack runs in
the square root of the exhaustive search. Let $F$ be any trapdoor function with an
output space of cardinality $2^r$. The well known birthday paradox forgery attack
computes $2^{r/2}$ hash² values $MD(m_j)$ for some chosen messages, and picks at
random $2^{r/2}$ possible signatures. One of these signatures is expected to correspond
to one of the messages.

With our parameters the syndromes have a length of 144 bits and the
complexity of the attack is the complexity of sorting the $2^{144/2} = 2^{72}$ values, which
is $2^{72} \times 72 \times 144 \simeq 2^{85}$ binary operations. This attack is not more threatening
than the decoding attack, and in addition it requires a memory of about $2^{72} \times 72$
bits. Note also that the above attack depends on the syndrome length and not
on the signature length; this will remain true later, even in the variants with
shorter signature length.
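The arithmetic behind these figures is easy to reproduce. The short Python sketch below simply evaluates the two products from the paragraph above in base-2 logarithms; the variable names are hypothetical.

```python
from math import log2

r = 144                                        # syndrome length in bits
values_log2 = r / 2                            # 2^72 values to sort
sort_cost_log2 = values_log2 + log2(r / 2) + log2(r)
memory_log2 = values_log2 + log2(r / 2)
```

This gives a sorting cost of roughly $2^{85.3}$ binary operations and a memory of roughly $2^{78.2}$ bits.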



5 Implementation Aspects 

For any signature scheme there is an easy security preserving tradeoff between
signature length and verification time. One may remove any $b$ bits from the
signature if one accepts exhaustive verification in $2^b$ for each possible value of
the $b$ missing bits. In the case of syndrome-based signatures, one can do much
better. As the signature consists of an error pattern of weight $t$, one may send
only $t - 1$ out of the $t$ errors. The verifier needs to decode the remaining error
and this is much faster than the exhaustive search. More generally we are going
to show that concealing a few errors (between 1 and 3) remains an excellent
compromise as summarized in Table 3.

² MD denotes a cryptographic hash function with an output of $r$ bits.

5.1 Cost of a Verification 

Let $s$ denote the hash value of the message and $z$ denote the error pattern of
weight $t$ such that $Hz^T = s$. As $z$ is the signature, we can compute $y = Hz^T$ by
adding the $t$ corresponding columns. The signature is accepted if $y$ is equal
to $s$. The total cost of this verification is $t$ column operations³.

If $u$ is a word of weight $t - 1$ whose support is included in the support of $z$,
we compute $y = s + Hu^T$, which costs $t - 1$ column operations, and we check
that $y$ is a column of $H$, which does not cost more than one column operation
if the matrix $H$ is properly stored in a hash table.
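The verification with one omitted error can be sketched as follows. The code is an illustration under stated assumptions: columns of $H$ are packed into Python integers, a dictionary plays the role of the hash table, and the toy matrix (columns 1 to 7 of the [7, 4] Hamming code) and function names are hypothetical.

```python
def build_column_index(columns):
    """Hash table mapping each column of H to its position."""
    return {col: pos for pos, col in enumerate(columns)}


def missing_position(columns, index, s, sent_positions):
    """Return the omitted error position, or None if the check fails."""
    y = s
    for pos in sent_positions:        # t - 1 column operations
        y ^= columns[pos]
    pos = index.get(y)                # one table lookup
    if pos is None or pos in sent_positions:
        return None
    return pos


columns = [1, 2, 3, 4, 5, 6, 7]       # toy parity check matrix
index = build_column_index(columns)
s = columns[0] ^ columns[4]           # syndrome of the pattern {0, 4}
recovered = missing_position(columns, index, s, [0])
```

Here `recovered` is 4, the position that was left out of the transmitted word; a syndrome that does not match any remaining column makes the function return `None`.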

Omitting two errors. Let us assume now that the word $u$ transmitted as
signature has weight $t - 2$. There exists a word $x$ of weight 2 such that
$Hx^T = y = s + Hu^T$. We are looking for two columns of $H$ whose sum is equal to $y$. All we
have to do is to add $y$ to any column of $H$ and look for a match in $H$. Again if
the columns of $H$ are properly stored, the cost is at most $2n$ column operations.

This can be improved, as the signer can choose which 2 errors are left for the
verifier to correct and omit in priority the positions which will be tested first;
this divides the complexity on average by $t$ (i.e. the match will be found on
average after $n/t$ tries).



Omitting more errors. In general, if $u$ has weight $t - w$, we put $y = s + Hu^T$
and we need to compute the sum of $y$ plus any $w - 1$ columns of $H$ and check
for a match among the columns of $H$. Proper implementation will cost at most
$3\binom{n}{w-1}$ column operations (yes, it is always 3, don't ask why!).

Again, if the signer omits the set of $w$ errors which are tested first, the average
cost can be divided by $\binom{t}{w}$.

Note that if more than 2 errors are not transmitted, the advantage is not
better than the straightforward time/length tradeoff.

Table 3. Tradeoffs for the 9-error correcting Goppa code of length $2^{16}$

  remaining    cost of           signature
  errors       verification      length
  0            $t = 9$           144 bits
  1            $t = 9$           132 bits
  2            $2^{14}$          119 bits
  3            $2^{27}$          105 bits
  4            $2^{40}$          92 bits

  Costs are in column operations (≈ 4 to 8 CPU clocks).

5.2 Partitioning the Support

Punctured code. Puncturing a code in $p$ positions consists in removing the
corresponding coordinates from the codewords. The resulting code has length
$n - p$ and, in general, the same dimension⁴ $k$. Without loss of generality we can
assume that the punctured positions come first. A parity check matrix $H'$ of the
punctured code $C'$ can be derived from any parity check matrix $H$ of $C$ by a
Gaussian elimination: for some non-singular $(n - k) \times (n - k)$ matrix $U$ we have

$$UH = \begin{pmatrix} I & R \\ 0 & H' \end{pmatrix}$$

where $I$ denotes the $p \times p$ identity matrix.

Given a syndrome $s$ we wish to find $z \in \mathbf{F}_2^n$ of weight $t$ such that $s = Hz^T$.
We denote by $s''$ and $s'$ respectively the $p$ first and the $n - p - k$ last bits of $Us$, and
by $z'$ the last $n - p$ bits of $z$. Let $w \le t$ denote the weight of $z'$.

$$\begin{cases} s = Hz^T \\ w_H(z) \le t \end{cases} \iff \begin{cases} s' = H'z'^T \\ w_H(z') + w_H(Rz'^T + s'') \le t \end{cases}$$

³ In this section we will count all complexities in terms of column operations; one
column operation is typically one access to a table and one operation like an addition
or a comparison.

⁴ The actual dimension is the rank of a matrix derived from a generating matrix by
removing the $p$ columns.



Shorter signatures. We keep the notations of the previous section. We
partition the support of $C$ into $n/l$ sets of size $l$. Instead of giving the $t - w$ positions,
we give the $t - w$ sets containing these positions. These $p = l(t - w)$ positions
are punctured to produce the code $C'$. To verify the signature $s$ we now have
to correct $w$ errors in $C'$, i.e. find $z'$ of weight $w$ such that $s' = H'z'^T$. The
signature is valid if there exists a word $z'$ such that

$$w_H(z') \le w \qquad (1)$$

$$w_H(Rz'^T + s'') \le t - w \qquad (2)$$

We may find several values of $z'$ verifying (1), but only one of them will also
verify (2). If $l$ is large, we have to check equation (2) often. On the other hand,
large values of $l$ produce shorter signatures. The best compromise is $l = m$ or a
few units more.

The cost for computing $H'$ is around $tm2^{m-1}$ column operations (independently
of $l$ and $w$). The number of column operations for decoding errors in $C'$
is the same as in $C$ but columns are smaller.

The signature size will be $\log_2\left(t!\binom{n/l}{t-w}\right)$ bits. If more than 3 errors are not
transmitted, the length gain is not advantageous.
Table 4. Tradeoffs for $m = 16$, $t = 9$ and $l = m$

  remaining      cost of           signature
  errors (w)     verification      length
  1              $2^{22}$          100 bits
  2              $2^{22}$          91 bits
  3              $2^{27}$          81 bits
  4              $2^{40}$          72 bits

  Costs are in column operations (≈ 2 to 6 CPU clocks).



5.3 New Short Signature Schemes 

With parameters $m = 16$ and $t = 9$, there are three interesting trade-offs
between verification time and signature length. All three of them have the same
complexity for computing the signature (in our implementation the order of
magnitude is one minute) and the same security level of about $2^{80}$ CPU
operations.

Fast verification (CFS1). We transmit 8 out of the 9 error positions; the
verification is extremely fast and the average signature length is
$\log_2\left(t!\binom{n}{t-1}\right) \simeq 131.1 < 132$ bits.

Short signature (CFS3). We partition the support in $2^{12}$ cells of 16 bits and
we transmit 6 of the 9 cells. The verification time is relatively long, around
one second, and the average signature length is
$\log_2\left(t!\binom{n/l}{t-3}\right) \simeq 80.9 < 81$ bits.

Half & half (CFS2). We transmit the rightmost 7 error positions (out of 9).
The verification algorithm starting with the left positions will be relatively
fast on average, less than one millisecond. The average signature length is
$\log_2\left(t!\binom{n}{t-2}\right) \simeq 118.1 < 119$ bits.

In all three cases, to obtain a constant length signature one should be able to
upper bound the number of decoding attempts. This is not possible; however by
adding 5 bits to the signature the probability of failing to sign a message is less
than $2^{-46}$, and with 6 bits it drops to $2^{-92}$.
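The three signature lengths and the two failure probabilities can be recomputed with a few lines of Python. This is a numerical check rather than part of the scheme; it assumes that $b$ extra counter bits allow $2^b \cdot 9!$ decoding attempts, each succeeding independently with probability $1/9!$, and the function names are hypothetical.

```python
from math import comb, factorial, log, log1p, log2


def signature_bits(cells, sent, t=9):
    """log2(t! * C(cells, sent)): counter bits plus index bits."""
    return log2(factorial(t) * comb(cells, sent))


def log2_failure(extra_bits, t=9):
    """log2 of the probability that 2^extra_bits * t! attempts all fail."""
    attempts = 2 ** extra_bits * factorial(t)
    return attempts * log1p(-1 / factorial(t)) / log(2)


n, t = 2 ** 16, 9
cfs1 = signature_bits(n, t - 1)          # 8 positions sent
cfs2 = signature_bits(n, t - 2)          # 7 positions sent
cfs3 = signature_bits(n // 16, t - 3)    # 6 cells out of 2^12 sent
```

The values come out slightly above 131.1, 118.1 and 80.9 bits respectively, and `log2_failure(5)` and `log2_failure(6)` are close to -46.2 and -92.3.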

5.4 Related Work 

It seems that up till now the only signature scheme that allowed such short
signatures was Quartz [14], based on the HFE cryptosystem [13]. It is enabled by
a specific construction that involves several decryptions in order to avoid the
birthday paradox forgery described in §4.1, which runs in the square root of the
exhaustive search. This method is apparently unique to multivariate quadratic
cryptosystems such as HFE and works only if the best attack on the underlying
trapdoor is well above the square root of the exhaustive search [13,14]. Such is
not the case for the syndrome decoding problems.
6 Asymptotic Behavior

In order to measure the scalability of the system, we will examine here how the 
complexity for computing a signature and the cost of the best known attack 
evolve asymptotically. We consider a family of binary t-error correcting Goppa
codes of length $n = 2^m$. These codes have dimension $k = n - tm$.

6.1 Signature Cost 

We need to make t! decoding attempts; each of these attempts requires the
following.

1. Compute the syndrome. As we are using Niederreiter's scheme we already
have the syndrome; we only need to expand it into something usable by the
decoder for alternant codes. The vector needed has a size of 2tm bits and
is obtained from the syndrome by a linear operation; this costs $O(t^2 m^2)$
operations in $\mathbb{F}_2$.

2. Solve the key equation. In this part, we apply the Berlekamp-Massey algorithm
to obtain the locator polynomial $\sigma(z)$; this costs $O(t^2)$ operations in $\mathbb{F}_{2^m}$.

3. Find the roots of the locator polynomial. If the syndrome is decodable,
the polynomial $\sigma(z)$ splits in $\mathbb{F}_{2^m}[z]$ and its roots give the error
positions. Actually we only need to check that the polynomial splits, that is
$\gcd(\sigma(z), z^{2^m} - z) = \sigma(z)$. This requires $O(t^2 m)$ operations in $\mathbb{F}_{2^m}$.

We will assume that one operation in $\mathbb{F}_{2^m}$ requires $m^2$ operations in $\mathbb{F}_2$; the total
number of operations in $\mathbb{F}_2$ to achieve a signature is thus proportional to $t!\,t^2 m^3$.
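To get a feel for the orders of magnitude, the three per-attempt steps can be added up for concrete parameters. This is a rough sketch with all constant factors dropped; it assumes a field operation costs m^2 bit operations and that the splitting test costs t^2 m field operations, and the function names are illustrative only.

```python
from math import factorial, log2

def per_attempt_bit_ops(m, t):
    """F_2 operations for one decoding attempt (constants dropped)."""
    field_op = m ** 2                      # assumed cost of one F_{2^m} operation
    expand_syndrome = t ** 2 * m ** 2      # step 1, counted directly in F_2
    key_equation = t ** 2 * field_op       # step 2, Berlekamp-Massey
    split_check = t ** 2 * m * field_op    # step 3, assumed t^2 m field operations
    return expand_syndrome + key_equation + split_check

def signing_bit_ops(m, t):
    """Expected F_2 operations for one signature: t! decoding attempts."""
    return factorial(t) * per_attempt_bit_ops(m, t)

m, t = 16, 9
total = signing_bit_ops(m, t)
dominant = factorial(t) * t ** 2 * m ** 3
print("total    ~ 2^%.1f" % log2(total))
print("dominant ~ 2^%.1f" % log2(dominant))
```

The splitting test dominates, so the total stays within a small constant factor of t! t^2 m^3.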

6.2 Best Attacks Complexity 

Decoding attacks. The best known (and implemented) attack by decoding is
by Canteaut and Chabaud [4] and its asymptotic time complexity is (empirically)
around $(n/\log_2 n)^{f(t)}$ where $f(t) = \lambda t - c$ is an affine function with $\lambda$ not much
smaller than 1 and c is a small constant between 1 and 2.

Good estimates of the asymptotic behavior of the complexity of the best
known general decoding techniques are given by Barg in [2]. In fact, when the
rate R = k/n of the code tends to 1, the time and space complexity becomes
$2^{n(1-R)/2\,(1+o(1))}$ which, for Goppa codes, gives $2^{tm(1/2+o(1))}$.

Structural attack. Very little is known about the distinguishability of Goppa
codes. In practice, the only structural attack [9] consists in enumerating all
Goppa codes and then testing equivalence with the public key. The code
equivalence problem is difficult in theory [15] but easy in practice [17]. There are $2^{tm}/t$
binary t-error correcting Goppa codes of length $n = 2^m$; because of the properties
of extended Goppa codes [10, Ch. 12, §4] only one out of $mn^3$ must be tested
and, finally, the cost for equivalence testing cannot be lower than $n(tm)^2$ (a
Gaussian elimination). Putting everything together leads to a structural attack
whose cost is not less than $tmn^{t-2}$ elementary operations.
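The bound can be checked by multiplying the three factors exactly. The sketch below assumes the counts read as 2^{tm}/t Goppa codes, one candidate out of m n^3 to test, and n(tm)^2 operations per equivalence test; the function name is hypothetical.

```python
from fractions import Fraction
from math import log2

def structural_attack_cost(m, t):
    """Lower bound on the enumeration attack, using exact rational arithmetic."""
    n = 2 ** m
    goppa_codes = Fraction(n ** t, t)       # assumed count 2^{tm}/t
    to_test = goppa_codes / (m * n ** 3)    # only one out of m n^3 is tested
    per_test = n * (t * m) ** 2             # one Gaussian elimination
    return to_test * per_test

m, t = 16, 9
cost = structural_attack_cost(m, t)
print("cost ~ 2^%.1f elementary operations" % log2(float(cost)))
```

For m = 16 and t = 9 the product simplifies to t m n^{t-2}, a little above 2^119.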
Table 5. Characteristics of the signature scheme based on a ($n = 2^m$, $k = n - tm$, $d \ge 2t + 1$) binary Goppa code

signature cost                 | $t!\,t^2 m^3$
signature length (1)           | $(t - 1)m + \log_2 t$
verification cost (1)          | $t^2 m$
public key size                | $tm2^m$
cost of best decoding attack   | $2^{tm(1/2+o(1))}$
cost of best structural attack | $tm2^{m(t-2)}$

(1) One error position omitted



6.3 Intrinsic Strengths and Limitations 

In Table 5 all complexities are expressed in terms of t and $m = \log_2 n$ and we
may state the following facts:

- the signature cost depends exponentially on t,

- the public-key size depends exponentially on m,

- the security depends exponentially on the product tm.

From this we can draw the conclusion that if the system is safe today it can only
be better tomorrow, as its security will depend exponentially on the signature
size. On the other hand the signature cost and the key size will always remain
high, as we will need to increase t or m or both to maintain a good security level.
However, relative to the technology, this handicap will never be as important
as it is today and will even decrease rapidly.

7 Security Arguments 

In this section we reduce the security of the proposed scheme in the random 
oracle model to two basic assumptions concerning hardness of general purpose 
decoding and pseudo-randomness of Goppa codes. We have already measured the 
security in terms of the work factor of the best known decoding and structural 
attacks. We have seen how the algorithmic complexity of these attacks will evolve 
asymptotically. The purpose of the present section is to give a formal proof that 
breaking the CFS signature scheme implies a breakthrough in one of two well 
identified problems. This reduction gives an important indication on where the 
cryptanalytic efforts should be directed. 

One of these problems is decoding; it has been widely studied and a major
improvement is unlikely in the near future. The other problem is connected
to the classification of Goppa codes or linear codes in general. Classification
issues have been at the core of coding theory since its emergence in the 50's. So far
nothing significant is known about Goppa codes, more precisely there is no 
known property invariant by permutation and computable in polynomial time 
which characterizes Goppa codes. Finding such a property or proving that none 
exists would be an important breakthrough in coding theory and would also 
probably seal the fate, for good or ill, of Goppa code-based cryptosystems. 
7.1 Indistinguishability of Permuted Goppa Codes

Definition 1 (Distinguishers). A T-time distinguisher is a probabilistic Turing
machine running in time T or less such that it takes a given F as an input
and outputs $A^F$ equal to 0 or 1. The probability it outputs 1 on F with respect
to some probability distribution $\mathcal{F}$ is denoted as:

$$\Pr[F \leftarrow \mathcal{F} : A^F = 1]$$

Definition 2 ((T, ε)-PRC). Let A be a T-time distinguisher. Let RND(n, k) be
the uniform probability distribution of all binary linear (n, k)-codes. Let $\mathcal{F}(n,k)$
be any other probability distribution. We define the distinguisher's advantage as:

$$\mathrm{Adv}^{\mathrm{PRC}}_{\mathcal{F}}(A) = \left| \Pr[F \leftarrow \mathcal{F}(n,k) : A^F = 1] - \Pr[F \leftarrow \mathrm{RND}(n,k) : A^F = 1] \right|$$

We say that $\mathcal{F}(n,k)$ is a (T, ε)-PRC (Pseudo-Random Code) if we have:

$$\max_{T\text{-time } A} \mathrm{Adv}^{\mathrm{PRC}}_{\mathcal{F}}(A) \le \varepsilon.$$

7.2 Hardness of Decoding 

In this section we examine the relationships between signature forging and two 
well-known problems, the syndrome decoding problem and the bounded-distance
decoding problem. The first is NP-complete and the second is conjectured NP- 
hard. 

Definition 3 (Syndrome Decoding - SD). 

Instance: A binary r × n matrix H, a word s of $\mathbb{F}_2^r$, and an integer w > 0.
Problem: Is there a word x in $\mathbb{F}_2^n$ of weight ≤ w such that $Hx^T = s$?

This decision problem was proven NP-complete [3]. Achieving complete decoding
for any code can be done by a polynomial (in n) number of calls to SD. Actually
the instances of SD involved in breaking code-based systems are in a particular
subclass of SD where the weight w is bounded by half of the minimum
distance of the code with parity check matrix H. It has been stated by Vardy in
[16] as:

Definition 4 (Bounded-Distance Decoding - BD). 

Instance: An integer d, a binary r × n matrix H such that every d − 1 columns
of H are linearly independent, a word s of $\mathbb{F}_2^r$, and an integer w ≤ (d − 1)/2.
Problem: Is there a word x in $\mathbb{F}_2^n$ of weight ≤ w such that $Hx^T = s$?

It is probably not in NP because the condition on H is NP-hard to check. However
several prominent authors [1,16] conjecture that BD is NP-hard.
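To make the two decision problems concrete, here is a small exhaustive-search sketch of SD on a toy instance. It is only an illustration: the search is exponential in general, and the [7,4] Hamming code used as the example (minimum distance 3, so w ≤ 1 is a BD instance) is our choice, not the paper's.

```python
from itertools import combinations

def syndrome(H, x):
    """Return H x^T over F_2 for a list-of-rows matrix H and a bit list x."""
    return [sum(h * xi for h, xi in zip(row, x)) % 2 for row in H]

def solve_sd(H, s, w):
    """Return a word x of weight <= w with H x^T = s, or None (exhaustive search)."""
    n = len(H[0])
    for weight in range(w + 1):
        for support in combinations(range(n), weight):
            x = [1 if j in support else 0 for j in range(n)]
            if syndrome(H, x) == s:
                return x
    return None

# Parity-check matrix of the [7,4] Hamming code: column j is j written in binary.
H = [[(j >> b) & 1 for j in range(1, 8)] for b in range(3)]

print(solve_sd(H, [1, 0, 1], 1))   # a weight-1 word whose syndrome is 5 in binary
print(solve_sd(H, [1, 0, 1], 0))   # no weight-0 solution for a nonzero syndrome
```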
Relating signature forging and BD. An attacker who wishes to forge for a
message M a signature of weight t with the public key H has to find a word
of weight t whose syndrome lies in the set $\{h(M,i) \mid i \in \mathbb{N}\}$, where h() is a
proper cryptographic hash function (see §4). Under the random oracle model,
the only possibility for the forger is to generate any number of syndromes of
the form h(M, i) and to decode one of them; this cannot be easier than
BD(2t + 1, H, h(M, i), t) for some integer i.

Relating signature forging and SD. Let us consider the following problem: 



Definition 5 (List Bounded-Distance Decoding - LBD). 

Instance: An integer d, a binary r × n matrix H such that every d − 1 columns of
H are linearly independent, a subset S of $\mathbb{F}_2^r$, and an integer $w \le \lfloor (d-1)/2 \rfloor$.
Problem: Is there a word x in $\mathbb{F}_2^n$ of weight ≤ w such that $Hx^T \in S$?

Using this problem we will show how we may relate the forging of a signature 
to an instance of SD: 

- In practice the forger must at least solve LBD(2t + 1, H, S, t) where
$S \subset \{h(M,i) \mid i \in \mathbb{N}\}$. The probability for the set S to contain at least one
correctable syndrome is greater than $1 - e^{-\lambda}$ where $\lambda = |S|\binom{n}{t}/2^r$. This
probability can be made arbitrarily close to one if the forger can handle a
set S big enough.

- Similarly, from any syndrome $s \in \mathbb{F}_2^r$, one can derive a set
$R_{\delta,s} \subset \{s + Hu^T \mid u \in \mathbb{F}_2^n, w_H(u) \le \delta\}$ where $\delta = d_{vg} - t$ and $d_{vg}$ is an integer such that
$\binom{n}{d_{vg}} > 2^r$. With probability close to $1 - e^{-\mu}$ where $\mu = |R_{\delta,s}|\binom{n}{t}/2^r$, we have
LBD(2t + 1, H, $R_{\delta,s}$, t) = SD(H, s, $d_{vg}$). Thus solving LBD(2t + 1, H, $R_{\delta,s}$, t)
is at least as hard as solving SD(H, s, $d_{vg}$).

- We would like to conclude now that forging a signature is at least as hard
as solving SD(H, s, $d_{vg}$) for some s. This would be true if solving
LBD(2t + 1, H, S, t) was harder than solving LBD(2t + 1, H, $R_{\delta,s}$, t) for some s, which
seems difficult to state. Nevertheless, with sets S and $R_{\delta,s}$ of the same size, it
seems possible to believe that the random set (S) will not be the easiest to
deal with.
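The size of the set S a forger needs can be estimated directly from the first bound. The sketch below assumes the bound reads 1 − e^{−λ} with λ = |S|·C(n, t)/2^r, and takes r = tm = 144 for the parameters m = 16, t = 9; the function name is ours.

```python
from math import comb, exp, factorial

def correctable_probability_bound(set_size, n, t, r):
    """Lower bound 1 - exp(-lambda) with lambda = |S| * C(n, t) / 2^r."""
    lam = set_size * comb(n, t) / 2 ** r
    return 1 - exp(-lam)

m, t = 16, 9
n, r = 2 ** m, t * m
for multiple in (1, 10):
    size = multiple * factorial(t)
    p = correctable_probability_bound(size, n, t, r)
    print("|S| = %d * t!  ->  bound %.5f" % (multiple, p))
```

Since C(n, t)/2^r is close to 1/t!, a set of about t! syndromes already gives a bound near 1 − 1/e, and ten times as many pushes it very close to one.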

Though the security claims for our signature scheme will rely on the difficulty
of BD(2t + 1, H, s, t), it is our belief that it can be reduced to the hardness of
SD(H, s, $d_{vg}$) (note that $d_{vg}$ depends only on n and r, not on t). If we assume the
pseudo-randomness of the hash function h() and of Goppa codes these instances
are very generic.

7.3 Security Reduction 

We assume that the permuted Goppa code used in our signature scheme is a
$(T_{Goppa}, 1/2)$-PRC, i.e. it cannot be distinguished from a random code with an
advantage greater than 1/2 for all adversaries running in time < $T_{Goppa}$.

We assume that an instance of BD(2t + 1, H, s, t) where H and s are chosen
randomly cannot be solved with probability greater than 1/2 by an adversary
running in time < $T_{BD}$.

Theorem 1 (Security of CFS). Under the random oracle assumption, a T-time
algorithm that is able to compute a valid pair message+signature for CFS
with a probability > 1/2 satisfies:

$$T > \min(T_{Goppa}, T_{BD}).$$

Proof (sketch): Forging a signature is at least as hard as solving BD(2t + 1, H, s, t)
where s = h(M, i) (see §7.2) and H is the public key. Under the random oracle
assumption, the syndrome h(M, i) can be considered as random. If someone is
able to forge a signature in time T < $T_{BD}$, then with probability 1/2 the matrix
H has been distinguished from a random one and we have T > $T_{Goppa}$.

8 Conclusion 

We demonstrated how to achieve digital signatures with the McEliece public key
cryptosystem. We propose 3 schemes that have tight security proofs in the random
oracle model. They are based on the well-known hard syndrome decoding problem,
for which the best known algorithms are still exponential after some 30 years of
research. Table 6 summarizes the concrete security of our schemes compared to
some other known signature schemes.



Table 6. McEliece compared to some known signature schemes

base cryptosystem       | RSA          | ElGamal      | EC             | HFE            | McEliece/Niederreiter
signature scheme        | RSA          | DSA          | ECDSA          | Quartz         | CFS1 | CFS2 | CFS3
data size(s)            | 1024         | 160/1024     | 160            | 100            | 144

security
structural problem      | factoring    | DL(p)        | Nechaev group? | HFEv-          | Goppa ≟ PRCode
best structural attack  | [illegible]  | [illegible]  | ∞              | > [illegible]  | [illegible]
inversion problem       | RSAP         | DL(q)        | EC DL          | MQ             | SD
best inversion attack   | [illegible]  | [illegible]  | [illegible]    | [illegible]    | [illegible]

efficiency
signature length        | 1024         | 320          | 321            | 128            | 132 | 119 | 81
public key [kbytes]     | 0.2          | 0.1          | 0.1            | 71             | 1152
signature time 1 GHz    | 9 ms         | 1.5 ms       | 5 ms           | 15 s           | 10 - 30 s
verification time 1 GHz | 9 ms         | 2 ms         | 6 ms           | 40 ms          | < 1 μs | < 1 ms | ≈ 1 s

The proposed McEliece-based signature schemes have unique features that
will make them an exclusive choice for some applications while excluding others. On
one hand, we have seen that both key size and signing cost will remain high, but
will evolve favorably with technology. On the other hand the signature length
and verification cost will always remain extremely small. Therefore if there is no
major breakthrough in decoding algorithms, it should be easy to keep up with
Moore's law.
Abstract. We use powerful new techniques for list decoding error-correcting
codes to efficiently trace traitors. Although much work has
focused on constructing traceability schemes, the complexity of the tracing
algorithm has received little attention. Because the TA tracing algorithm
has a runtime of O(N) in general, where N is the number of
users, it is inefficient for large populations. We produce schemes for which
the TA algorithm is very fast. The IPP tracing algorithm, though less
efficient, can list all coalitions capable of constructing a given pirate.
We give evidence that when using an algebraic structure, the ability
to trace with the IPP algorithm implies the ability to trace with the
TA algorithm. We also construct schemes with an algorithm that finds
all possible traitor coalitions faster than the IPP algorithm. Finally, we
suggest uses for other decoding techniques in the presence of additional
information about traitor behavior.



1 Introduction 

Traceability schemes are introduced in [9] and have been extensively studied in 
the intervening years for use as a piracy deterrent. We focus on one of the few 
aspects of this area of work that has received little attention: the complexity 
of the traitor tracing algorithms. We show that powerful new techniques for
the list decoding of error-correcting codes enable us to construct traceability 
schemes with very fast traitor tracing algorithms. Further, we use list decoding 
to give new algorithms for producing a list of all coalitions capable of creating 
a given pirate. In addition, we discuss potential applications of other decoding 
methods to the problem of tracing traitors, suggest alternative approaches when 
additional information is known about the way the traitors are operating, and 
examine the relationship between two important tracing algorithms. 

In a popular model for traceability schemes a unique set (possibly ordered) 
of r symbols is associated with each user. For example, the set may be associ- 
ated with a user’s software CD, or contained in a smartcard the user has for 
the purpose of viewing encrypted pay-TV programs (in the latter case, the set 
corresponds to a set of keys). When a coalition forms to commit piracy, it must 
construct a set to associate with the pirate object. In the case of unordered sets, 
this pirate set consists of r symbols, each of which belongs to at least one coali- 
tion member’s set. If the sets are ordered, the coalition members must form an 
ordered pirate set in which the symbol in each position is identical to the symbol 
in the same position in the ordered set of some coalition member. In either sce- 
nario a traitor tracing algorithm is applied to the pirate, and identifies an actual 
traitor or traitors. The approach we take here is to use error-correcting codes 
to construct traceability schemes in which the sets are ordered. The ordered 
(as opposed to the unordered) set scenario yields naturally to coding theoretic 
techniques and has many practical applications ([10,7]). 

We first focus on the TA traitor tracing algorithm (following the terminology 
in [40]), that identifies as traitors all users who share the most with the pirate. 
In general the TA algorithm runs in O(N) time, where N is the number of
users. However, this paper shows that for suitable constructions based on
error-correcting codes, tracing can be accomplished in time polynomial in c log N,
where c is the maximum coalition size. This is a significant improvement, as we 
expect c to be much smaller than N. The constructions in this paper match the 
best previously known schemes in this model in terms of the alphabet size that 
is required to achieve a certain level of traceability for a given codeword length, 
and exceed all earlier schemes in the speed with which they trace (at least) one 
traitor. 

We also consider the IPP tracing algorithm (following the terminology 
in [23]). The IPP algorithm identifies all coalitions capable of making a pirate 
and looks for a common member(s) amongst these coalitions. Hence, the IPP 
property seems to be a more fundamental traceability property. In general this 
algorithm runs in time $O(crN^c)$, where r is the length of each codeword, and
hence is even less efficient than the TA algorithm. However, there are two good 
reasons to be interested in IPP codes. First, the extra computational burden of 
the IPP algorithm has led to the question (see [37]) of whether IPP schemes 
may beat TA schemes in other respects, namely, in terms of the number of code- 
words for a fixed set of parameters. We provide evidence that for schemes with 
enough structure to enable efficient tracing algorithms, increasing the number of 
codewords causes tracing to fail with both the TA and IPP algorithms. Hence,
IPP codes do not appear to yield efficiency improvements in this respect. Sec- 
ondly, as part of the IPP tracing process, additional valuable piracy information 
is amassed, namely, a list of all coalitions capable of creating the pirate in ques- 
tion. Such a list is not a by-product of the TA algorithm, but is a useful part 
of a security audit. We show that when error-correcting codes are used to con- 
struct TA traceability codes (which are also IPP codes, by a result in [37]), list 
decoding techniques can be used to construct new algorithms for finding all such 
coalitions. We give an algorithm that is more efficient than the brute force ap- 
proach of the IPP algorithm of evaluating each coalition for its ability to create 
the pirate, thereby answering an open question in [37]. 

This paper gives the first applications of list decoding to the traitor trac- 
ing problem in the above model, although Zane [48] uses such techniques to 
address the related problem of watermark detection. (See Section 1.1 below for 
a discussion of this, and other, related work.) These list decoding techniques 
are receiving wide attention in the coding theory community, and improvements 
and generalizations are being rapidly produced. We believe that in this paper 
we have merely scratched the surface of the potential applications of decoding 
techniques to traceability. In the last section we discuss the use of other decoding 
methods when additional information is known about the traitors or how they 
operate, giving directions for future work in this area. 



Overview. Section 1.1 covers related work on traceability and broadcast en- 
cryption and Section 2 covers the necessary background on traceability and cod- 
ing theory. Section 3 describes how to construct efficient traceability schemes. 
Section 4 considers the relationship between TA and IPP traceability schemes, 
providing justification for our restriction to the TA case, and raising some ques- 
tions concerning the relationship between TA and IPP for linear codes. Section 5 
shows how codes of sufficiently large minimum distance enable a more efficient 
algorithm for finding all coalitions of traitors. A discussion of other potential 
applications of coding theoretic ideas and techniques to traceability questions is 
given in Section 6. 



1.1 Related Work 

The phrase traitor tracing is coined in [9] (see also the extended version [10]). 
In traceability schemes, users are each given an ordered (as in [9,7,15,37], for 
example), or unordered (as in [40], for example) set of keys. 

In [6] (see also the revised version [7]), methods for creating TA traceability 
codes are given for the purpose of fingerprinting digital data. Lower bounds and 
additional constructions of TA traceability schemes are given in [40], while lower
bounds are also proven in [27,26]. In addition, [26] provides a tracing algorithm
for schemes in [27]. 

The problem of combining broadcast encryption and traceability is studied 
in [41,16,29,46]. 
Variations on the models of [10,7] have been studied in recent years. Dynamic
models (here we study a static model), in which it is possible to get additional 
evidence of piracy in order to “test” traitor guesses, are studied in [15,3,33]. 
A public-key traitor tracing scheme is given in [5]. One of the nice properties 
of the scheme in [5] is that it is possible to identify all traitors. We note that 
although our algorithms in Sect. 3 can only guarantee the identification of one 
traitor, they do so in significantly faster time (polynomial in c log N, versus
$O(N \log^2 N \log\log N)$ in [5], with N the number of codewords and c the maximum
coalition size). 

In [31,11], ways in which accountability can be added to the model are dis- 
cussed. For example, to improve upon the strength of the deterrent, in [11] com- 
mitting piracy efficiently necessitates revealing sensitive information. In [17], a 
system in which pirate pay-TV decoders can only work for short periods of time 
is presented. As noted in [17], traceability can be a useful addition to a long-lived 
broadcast encryption scheme. If keys are allocated to smartcards in such a way 
as to ensure some traceability, it is possible to keep a list of traitor smartcards 
over time. If the smartcard of one particular user appears on the list frequently 
despite many smartcard refreshments (i.e., key changes) this mounting evidence 
makes it increasingly likely that the user is actually guilty, and not simply a 
victim of smartcard theft. Hence, as long as traceability schemes are efficient, 
they can quickly yield useful information during system audits. 

Recently, the identifiable parent property (IPP) tracing algorithm has gar- 
nered attention [23,2,37] (also, very similar ideas are studied in [39]). In [23], a 
combinatorial characterization of 2-IPP schemes is presented. Additional con- 
structions of and bounds for IPP schemes appear in [2,37]. 

A coding theoretic approach is taken in [25] to study the related problem 
of blacklisting users in a broadcast encryption scheme, but that paper does not 
address the question of tracing. 

Our approach takes advantage of recent powerful list decoding methods, 
which originated with the work of Sudan [42]. In list decoding the input is a 
received word and the output is the list of all codewords within a given Ham- 
ming distance of the received word. Sudan’s results by themselves are not strong 
enough to be applicable in the setting in which the TA algorithm succeeds in 
finding traitors (as opposed to identifying probable traitors), since the decoding 
procedure in [42] is not capable of correcting enough errors in the code. However, 
Sudan’s work has recently been extended to enable it to efficiently correct more 
errors; i.e., it extends the radius of the Hamming ball around the received word 
in which it can find all the codewords in time polynomial in the length of the 
codewords. The improvements in [19] are precisely sufficient to be applicable to 
the setting where the TA algorithm succeeds. An additional advantage of this 
method is that it gives a list containing one or more traitors, rather than only 
one. Efficient list decoding algorithms now exist for Reed-Solomon codes, more 
general algebraic geometry codes, and some concatenated codes. 

List decoding techniques are applied to the problem of watermarking in [48].
Whereas in traceability schemes each user has a unique codeword, in the
watermarking scenario each user needs to be given the same “document” V, taken
in [48] to be a vector of real numbers between 0 and 1. To prevent users from 
distributing pirated copies of V, each user is given a distinct, slightly modified 
“watermarked” version of V. The CKLS media watermarking scheme [8] is mod- 
ified in [48] so that the watermarks are chosen from a set of randomly generated 
CKLS codes according to a Reed-Solomon code. Given a suspected pirate copy 
of V, the results of [42] on list decoding can then be used to identify one or more 
traitors. 

Here, we consider the related question of traceability schemes, and we apply 
list decoding results for algebraic geometry codes and certain concatenated codes 
in addition to Reed-Solomon codes. In [48], Reed-Solomon codes are used to 
obtain vectors of real numbers between 0 and 1 to serve as a watermark, while 
here the error-correcting codes themselves are the traceability schemes. 

We note that algebraic geometry codes appear to have been under-utilized 
in cryptological applications. For example, the results of [34] can be used to give 
better explicit examples of c-frameproof codes than those obtained in [7]. The 
codes constructed in [34] are concatenated codes (see below) where the outer 
code is an algebraic geometry code coming from a Hermitian curve, while those 
used in [7] come from pseudo-random graphs (see [1]). 

2 Background on Codes and Traceability 

In this section we give definitions, notation, and background on codes, traceabil- 
ity, and the decoding techniques that form the basis for our tracing algorithms. 



2.1 Definitions and Notation 

A code C of length r is a subset of $Q^r$, where Q is a finite alphabet. The elements
of C are called codewords; each codeword has the form $x = (x_1, \ldots, x_r)$, where
$x_i \in Q$ for $1 \le i \le r$. Subsets of C will be called coalitions.

For any coalition $C_0 \subseteq C$, we define the set of descendants of $C_0$, denoted
$\mathrm{desc}(C_0)$, by

$$\mathrm{desc}(C_0) = \{ w \in Q^r : w_i \in \{x_i : x \in C_0\}, \text{ for all } 1 \le i \le r \}.$$

The set $\mathrm{desc}(C_0)$ consists of the r-tuples that could be produced by the coalition
$C_0$.

We define $\mathrm{desc}_c(C)$ to be the set of all $x \in Q^r$ for which there exists a
coalition $C_0$ of size at most c such that $x \in \mathrm{desc}(C_0)$. In other words, $\mathrm{desc}_c(C)$
consists of the r-tuples that could be produced by a coalition of size at most c.
For $x, y \in Q^r$, let $I(x,y) = \{i : x_i = y_i\}$.

Definition 1. A code C is a c-TA (traceability) code if for all coalitions $C_i$ of
size at most c, if $w \in \mathrm{desc}(C_i)$ then there exists $x \in C_i$ such that $|I(x,w)| >
|I(z,w)|$ for all $z \in C - C_i$.
In other words, C is a c-TA code if, whenever a coalition of size at most c
produces a pirate word w, there is an element of the coalition which is closer to 
w than any codeword not in the coalition. 

Codes with the identifiable parent property (IPP) are another type of trace- 
ability code. 

Definition 2. A code C is a c-IPP code if for all $w \in \mathrm{desc}_c(C)$, the intersection
of the coalitions $C_i$ of size at most c such that $w \in \mathrm{desc}(C_i)$ is nonempty.

Suppose C is a code of length r. The (Hamming) distance between two
elements x and y of $Q^r$ is $r - |I(x,y)|$. The minimum distance of the code C is the
smallest distance between distinct codewords of C.

If C is a c-IPP code and $w \in \mathrm{desc}_c(C)$, then the traitors that can produce
the pirate w are the codewords that lie in all coalitions $C_i$ of size at most c such
that $w \in \mathrm{desc}(C_i)$.

When implementing one of the traceability codes just described, one randomly
chooses a set of symbols $\{s_{(i,y)}\}$ with $i \in \{1, \ldots, r\}$ and y in the
alphabet Q, and the collection of symbols corresponding to a given user is
determined by the codeword associated with that user. For example, if the codeword
$x = (x_1, \ldots, x_r)$ is associated with user u, then the set of symbols associated
with user u is $S_u = \{s_{(1,x_1)}, \ldots, s_{(r,x_r)}\}$. It is $S_u$, not x, that the user stores (e.g.,
$S_u$ is embedded in the user's CD or smartcard). The encryption step makes the
model of pirate behavior that we consider reasonable. Since the symbols are 
generated randomly it is essentially impossible to guess a symbol, and hence a 
coalition is only able to form a pirate out of its pooled collection of symbols. In 
other words, moving from codewords to symbols thwarts algebraic attacks (such 
as, for example, the attack on [27] found in [41,5]). Although a coalition may 
be able to write down any codeword (this information may be public), it can 
only generate the symbol associated with an entry in the codeword if there is a 
coalition member that agrees with the codeword in that position. 



2.2 Background Traceability Results 

The following result, which is Lemma 1.3 of [37], is very useful for showing that 
a code is c-IPP. 

Lemma 1. ([37], Lemma 1.3) Every c-TA code is a c-IPP code. 

As shown in [37], there are c-IPP codes that are not c-TA. We give a simple 
example of a 2-IPP code that is not 2-TA. 

Example 1. Let $u_1 = (0,0,1)$, $u_2 = (1,0,0)$, and $u_3 = (2,0,0)$. The code
$\{u_1, u_2, u_3\}$ is clearly 2-IPP, since the first entry of a pirate determines a
traitor. The coalition $\{u_1, u_2\}$ can produce the pirate w = (0,0,0). However,
$|I(u_1,w)| = |I(u_2,w)| = |I(u_3,w)| = 2$, so the code is not 2-TA.
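Both claims in Example 1 can be confirmed by exhaustive search. The sketch below implements the definitions of descendants, c-TA and c-IPP literally; it is only practical for tiny codes, and the function names are ours.

```python
from itertools import combinations, product

def agreements(x, y):
    """|I(x, y)|: number of positions where x and y agree."""
    return sum(a == b for a, b in zip(x, y))

def desc(coalition):
    """All words whose i-th symbol is the i-th symbol of some coalition member."""
    r = len(coalition[0])
    return set(product(*[sorted({x[i] for x in coalition}) for i in range(r)]))

def coalitions(code, c):
    """All nonempty coalitions of size at most c."""
    for size in range(1, c + 1):
        yield from combinations(code, size)

def is_c_ta(code, c):
    """Definition 1: some coalition member beats every outsider, for every pirate."""
    for coal in coalitions(code, c):
        outsiders = [z for z in code if z not in coal]
        for w in desc(coal):
            best_inside = max(agreements(x, w) for x in coal)
            if any(agreements(z, w) >= best_inside for z in outsiders):
                return False
    return True

def is_c_ipp(code, c):
    """Definition 2: the coalitions that can produce a pirate share a member."""
    parents = {}
    for coal in coalitions(code, c):
        for w in desc(coal):
            parents[w] = parents.get(w, set(coal)) & set(coal)
    return all(parents.values())

code = [(0, 0, 1), (1, 0, 0), (2, 0, 0)]
print("2-IPP:", is_c_ipp(code, 2))
print("2-TA: ", is_c_ta(code, 2))
```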




Efficient Traitor Tracing Algorithms Using List Decoding 181 



Note that for c-IPP codes, traitor tracing is roughly an $O\left(\binom{N}{c}\right)$ process, where
N is the total number of codewords in the code. A traitor tracing algorithm for
a c-TA code takes as input a $w \in \mathrm{desc}_c(C)$ and outputs a codeword x such that
$|I(x,w)|$ is largest. Hence for c-TA codes, tracing is an O(N) process, in general.
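The generic TA tracing step is a single linear scan over the code. A minimal sketch follows; the repetition code used as the example is our own toy choice, and each of the N comparisons costs r symbol checks.

```python
def trace_ta(code, w):
    """Return the codewords that agree with the pirate word w in the most positions."""
    scores = [sum(a == b for a, b in zip(x, w)) for x in code]
    best = max(scores)
    return [x for x, s in zip(code, scores) if s == best]

# Toy code: three repetition words of length 4 over the alphabet {0, 1, 2}.
code = [(0, 0, 0, 0), (1, 1, 1, 1), (2, 2, 2, 2)]

print(trace_ta(code, (0, 0, 0, 1)))   # pirate built mostly from the first word
print(trace_ta(code, (0, 0, 1, 1)))   # a tie: both coalition members are returned
```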

The next result, which is proved in [37] (see Theorem 4.4 of that paper; see
also [9] and [10]), shows that for codes with large enough minimum distance the
TA algorithm suffices, and consists of finding codewords within distance $r - r/c$
from the pirate. Further, all codewords within this distance will be traitors.

Theorem 1. ([37], Theorem 4.4) Suppose C is a code of length r, c is a positive
integer, and the minimum distance d of C satisfies $d > r - r/c^2$. Then

(i) C is a c-TA code;

(ii) if $C_0$ is a coalition of size at most c, and $w \in \mathrm{desc}(C_0)$, then:

(a) there exists an element of $C_0$ within distance $r - r/c$ of w, and

(b) every codeword within distance $r - r/c$ of w is in the coalition $C_0$.
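A small experiment illustrates the theorem. The sketch assumes the distance condition reads d > r − r/c² and the tracing radius r − r/c; both are rewritten with integers to avoid rounding. The toy code over Z_7 and the function names are ours.

```python
from itertools import product

def agreements(x, y):
    """|I(x, y)|: number of positions where x and y agree."""
    return sum(a == b for a, b in zip(x, y))

def minimum_distance(code):
    r = len(code[0])
    return min(r - agreements(x, y) for i, x in enumerate(code) for y in code[i + 1:])

def satisfies_distance_condition(code, c):
    """Assumed condition d > r - r/c^2, rewritten as c^2 (r - d) < r."""
    r = len(code[0])
    return c * c * (r - minimum_distance(code)) < r

def trace_within_threshold(code, w, c):
    """Codewords within distance r - r/c of w, i.e. with c * |I(x, w)| >= r."""
    return [x for x in code if c * agreements(x, w) >= len(w)]

# Toy code over Z_7: words (a + b*i mod 7) for i = 0..6; distinct words agree at most once.
code = [tuple((a + b * i) % 7 for i in range(7)) for a, b in product(range(7), repeat=2)]
x, y = code[0], (1, 2, 3, 4, 5, 6, 0)
w = x[:4] + y[4:]                      # pirate assembled position by position from x and y

print("d =", minimum_distance(code))
print("condition for c = 2:", satisfies_distance_condition(code, 2))
print("traced:", trace_within_threshold(code, w, 2))
```

Here only x is returned: y agrees with the pirate in too few positions to be caught, which is consistent with the theorem guaranteeing at least one traitor rather than all of them.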

2.3 Linear Codes 

Linear codes are a very important class of codes. We will say that a code of 
length r is linear, or linear over Fq, if the alphabet is a finite field Fq and the 
code is a linear subspace of the vector space Fq. The dimension of the code is its 
dimension as a vector space. If (7 is a linear code over Fq of dimension k, then 
\C\=qK 

Reed-Solomon codes are among the most widely used linear codes, with many
useful applications (e.g., compact disks). To obtain a Reed-Solomon code of
length $r$ and dimension $k$ over the finite field $F_q$, fix $r$ distinct elements $\alpha_1, \ldots, \alpha_r$
of $F_q$. The codewords are exactly the $r$-tuples $(f(\alpha_1), \ldots, f(\alpha_r))$ as $f$ runs over
(the zero polynomial and) all polynomials of degree less than $k$ in $F_q[x]$. Note
that a basis for the code over $F_q$ is

$\{(1, \ldots, 1), (\alpha_1, \ldots, \alpha_r), (\alpha_1^2, \ldots, \alpha_r^2), \ldots, (\alpha_1^{k-1}, \ldots, \alpha_r^{k-1})\}$.

Since two distinct polynomials of degree less than $k$ agree on at most $k - 1$
points, the minimum distance of the code is $r - k + 1$.
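
The construction can be checked on a small case. The sketch below enumerates a Reed-Solomon code over a prime field $GF(p)$ (a simplifying assumption; the definition above allows any finite field) and verifies by exhaustive search that the minimum distance is $r - k + 1$. The function names and the parameters $p = 7$, $r = 6$, $k = 2$ are chosen only for illustration.

```python
from itertools import product

def rs_code(p, points, k):
    """All codewords (f(a_1), ..., f(a_r)) for polynomials f of degree < k over GF(p), p prime."""
    words = []
    for coeffs in product(range(p), repeat=k):
        words.append(tuple(
            sum(c * pow(a, e, p) for e, c in enumerate(coeffs)) % p
            for a in points))
    return words

def min_distance(code):
    """Smallest Hamming distance between two distinct codewords (exhaustive)."""
    return min(sum(x != y for x, y in zip(u, v))
               for i, u in enumerate(code) for v in code[i + 1:])

p, k = 7, 2
points = [0, 1, 2, 3, 4, 5]      # r = 6 distinct elements of GF(7)
code = rs_code(p, points, k)     # q^k = 49 codewords
```

Exhaustive enumeration is only feasible for toy parameters; it is used here purely to confirm the counting argument in the text.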

A useful generalization of Reed-Solomon codes are algebraic geometry (AG)
codes (see, for example, [18,38,44]). The linear codes with the “best” known
parameters asymptotically are AG codes [45]. One advantage of AG codes is
that they are not, in general, bound by the restriction that $r \le q$, as was the
case for the Reed-Solomon codes above. Being freed of this constraint allows us
to have a smaller alphabet (and in applications, fewer keys), for given choices
of the other parameters. Hermitian codes, coming from Hermitian curves, are
examples of AG codes that have nice properties and can be defined explicitly. For
those familiar with the terminology below (such knowledge is not essential for
appreciating the results of this paper), we note that for our purposes it suffices
to consider the one-point codes $C_X(\mathcal{P}, \ell P_0)$, which can be defined as follows.
Start with a smooth, absolutely irreducible curve $X$ of genus $g$ defined over a
finite field $F_q$, a set $\mathcal{P} = \{P_1, \ldots, P_r\}$ of $r$ distinct $F_q$-rational points on $X$,
another $F_q$-rational point $P_0$ on $X$ which is not in the set $\mathcal{P}$, and an integer $\ell$.
The codewords are then the $r$-tuples $(f(P_1), \ldots, f(P_r))$, where $f$ runs over the
rational functions on $X$ whose only pole is $P_0$, where the multiplicity is at most
$\ell$. If $2g - 2 < \ell < r$, this code has dimension $\ell + 1 - g$ and minimum distance
at least $r - \ell$. Reed-Solomon codes can be viewed as algebraic geometry codes
by taking $X$ to be the projective line, $\mathcal{P}$ to be the set of points corresponding
to the $r$ chosen field elements, $P_0$ to be the point at infinity, and $\ell = k - 1$.

Concatenated codes are codes which are “concatenated” from two other codes. 
When two linear codes are concatenated, the product of their lengths (resp., di- 
mensions, resp., minimum distances) is the length (resp., dimension, resp., mini- 
mum distance) of the (linear) concatenated code. There are linear concatenated 
codes for small alphabets which have good list decoding capabilities, i.e., a small 
list of possible codewords can be recovered even when a large percentage of the 
symbols are in error or have been erased [20]. 

We refer the reader to [18,28,38,44] for more information on coding theory. 



2.4 Decoding 

In the theory of error-correcting codes, a codeword is transmitted through a 
noisy channel and an element of $Q^r$ (i.e., a word) is received. The receiver (or
decoder) then tries to determine as accurately as possible which codeword was 
transmitted. 

If $d$ is the minimum distance of the code, then the receiver can “correct”
$t = \lfloor (d-1)/2 \rfloor$ errors; i.e., there is at most one codeword within distance $t$ of the
received word. The radius $t$ is called the error-correction bound or the packing
radius. Minimum-distance (or nearest-neighbor) decoding finds the closest codeword
to the received word. In practice, minimum-distance decoding is very slow.
In bounded-distance decoding, the decoder finds a codeword within a specified
distance of the received word, if one exists. In the bounded-distance decoding
decision problem, the inputs are a linear code over a given finite field, a received
word, and a specified distance $t$, and the output is a yes or no answer to the
question of whether there is a codeword within distance $t$ of the received word.
This decision problem is known to be NP-complete [4].

In list decoding, the goal is to output the list of all codewords within a spec- 
ified distance of the received word. In [42] and [43], Sudan gave the first effi- 
cient methods for list decoding that run in time polynomial in the length of 
the codewords. Since then, Sudan’s list decoding technique has been improved, 
generalized, and refined [35,36,19,20,21,22,24,30,32,47,12,13]. The runtimes for 
the steps of the algorithm have been improved, the number of errors that can be 
“corrected” has been increased, and the technique has been shown to be appli- 
cable to a larger class of codes. Sudan’s original algorithm is for Reed-Solomon 
codes. Other codes for which the techniques have been shown to apply include 
AG codes (for which the focus has been on Hermitian codes) and certain con- 
catenated codes (see [20] , where the “outer code” is a Reed-Solomon or AG code 
and the “inner code” is a Hadamard code). 







In erasure decoding, some positions of the received word are garbled or 
“erased”, and cannot be identified. In this case the decoder knows that errors 
occurred in those positions. In erasure-and-error decoding, the decoder receives a
word with some erasures and some errors, and determines the transmitted word, 
or a list of possible transmitted words (given some appropriate bounds on the 
numbers of errors and erasures). 

In soft-decision decoding, instead of receiving a (hard-decision) word, the
decoder receives a reliability matrix that states the probability that any given 
element of the alphabet was sent in any given position. Using this “soft” infor- 
mation, a soft-decision decoder outputs the most likely transmitted codeword(s). 



3 Efficient Tracing Algorithms via List Decoding 

In this section we show how the efficiency of the TA tracing algorithm can
be greatly improved when the traceability scheme is based on certain error-correcting
codes, and the tracing algorithm uses fast list decoding methods.
What is an $O(N)$ process in general becomes a process that runs in time polynomial
in $c \log N$. These constructions match the best previously known traceability
schemes in this model in terms of the alphabet size that is required to
support a given level of traceability and codeword length (roughly speaking, the
alphabet size is $O(N^{c^2/r})$). The following theorem describes constructions based
on Reed-Solomon, algebraic geometry, and concatenated codes. One advantage
of considering all three types of codes is that the appropriate code choice for the
traceability scheme depends on the desired parameters.

Theorem 2. (i) Let $C$ be a Reed-Solomon code of length $r$ and dimension
$k$ over a finite field $F_q$ of size at most $2^r$. If $c$ is an integer, $c \ge 2$, and
$r > c^2(k-1)$, then $C$ is a c-TA code and there is a traitor tracing algorithm
that runs in time $O(r^{15})$. If $r = (1+\delta)c^2(k-1)$ then the algorithm runs in
time $O(r^3/\delta^6)$. For $r = O(c^2 k)$, the runtime is $O(c^{30} \log_q^{15} N)$.

(ii) Let $X$ be a nonsingular plane curve of genus $g$ defined over a finite field $F_q$,
$\mathcal{P}$ a set of $r$ distinct $F_q$-rational points on $X$, $P_0$ an $F_q$-rational point on $X$
which is not in $\mathcal{P}$, and $k$ an integer such that $k > g - 1$. Let $c$ be an integer
such that $c \ge 2$ and $r > c^2(k+g-1)$, assume that $q \le 2^r$, and assume the
pre-processing described in [19] has occurred. Then the one-point AG code
$C_X(\mathcal{P}, (k+g-1)P_0)$ is a c-TA code with a traitor tracing algorithm that
runs in time polynomial in $r$.

(iii) If $k$ and $c$ are positive integers, $q$ is a prime power, $q > c^2 \ge 4$, and $\delta$
is a real number such that $0 < \delta < \frac{q-c^2}{c^2(q-1)}$, then there exists an explicit
linear c-TA code over the field $F_q$ of length $r = O(\ldots)$ (or length
$r = O(\ldots/(\delta^2 \log^2(1/\delta)))$) and dimension $k$ with a polynomial (in $r$) traitor tracing
algorithm.

Proof. (i) Since $C$ is a Reed-Solomon code, the minimum distance $d$ satisfies
$d = r - k + 1$. The condition $r > c^2(k-1)$ is then equivalent to the condition
$d > r - r/c^2$. By Theorem 1, $C$ is a c-TA code and traitor tracing amounts
to finding a codeword within distance $r - r/c$ of the pirate. Theorem 12
and Corollary 13 of [19] imply that if $t > \sqrt{(k-1)r}$ then all codewords
within distance $r - t$ of a given word can be listed in time $O(r^{15})$, and if
$t^2 = (1+\delta)(k-1)r$ then the runtime is $O(r^3/\delta^6)$. Taking $t = r/c$ gives the
desired results. (Note that $k = \log_q N$.)

(ii) The minimum distance $d$ of the code satisfies $d \ge r - k - g + 1$ (see, for
example, Theorem 10.6.3 of [28]). By our choice of $c$ we have $d \ge r - k -
g + 1 > r - r/c^2$ and $r - r/c < r - \sqrt{r(k+g-1)}$. By Theorem 27 of [19],
there exists an algorithm that runs in time polynomial in $r$ that outputs
the list of codewords of distance less than $r - \sqrt{r(k+g-1)}$ from a given
word. Now apply Theorem 1.

(iii) Theorems 7 and 8 and Corollaries 2 and 3 of [20] imply that there exists an
explicit concatenated code over $F_q$ of the correct length $r$ and dimension
$k$, with minimum distance $d \ge (1 - \frac{1}{q})(1 - \delta)r$, with a polynomial time list
decoding algorithm for $e$ errors, as long as $e < (1 - \sqrt{\delta})(q-1)r/q$. The
condition $\delta < \frac{q-c^2}{c^2(q-1)}$ implies that $d > r - r/c^2$ and that the upper bound
on the number of errors is satisfied when $e \le r - r/c$. The result now follows
from Theorem 1. □
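
To make part (i) concrete, the following sketch builds a small Reed-Solomon code with $r > c^2(k-1)$, forms a pirate word from a coalition of size $c = 2$, and lists every codeword within distance $r - r/c$ of the pirate. The exhaustive search is only a stand-in for the list decoding algorithm of [19], which is not implemented here; the parameters, the two traitors, and the function names are assumptions for illustration.

```python
from itertools import product

def rs_code(p, points, k):
    """Reed-Solomon code over GF(p), p prime: evaluations of all polynomials of degree < k."""
    return [tuple(sum(c * pow(a, e, p) for e, c in enumerate(coeffs)) % p
                  for a in points)
            for coeffs in product(range(p), repeat=k)]

def list_decode_brute_force(code, w, c):
    """Codewords within distance r - r/c of w, i.e. agreeing with w in at least r/c positions."""
    r = len(w)
    return [x for x in code
            if c * sum(xi == wi for xi, wi in zip(x, w)) >= r]

p, k, c = 7, 2, 2
points = list(range(6))                        # r = 6 > c^2 (k - 1) = 4
code = rs_code(p, points, k)
u1 = tuple((3 + 2 * a) % p for a in points)    # traitor with f(x) = 3 + 2x
u2 = tuple((5 + 6 * a) % p for a in points)    # traitor with f(x) = 5 + 6x
pirate = u1[:3] + u2[3:]                       # each entry comes from some traitor
traitors = list_decode_brute_force(code, pirate, c)
```

In line with Theorem 1, the list is non-empty and contains only members of the coalition, so no innocent user is implicated.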

We emphasize that further improvements in the runtime of list decoding 
algorithms are being rapidly produced. It seems that some of these results will 
bring the runtime down to 0(rlog^ r) for Reed-Solomon codes, at least in certain 
cases (see [12]). The list decoding algorithm in [19] for AG codes was improved 
in [47] (see Theorems 3.4 and 4.1), where an explicit runtime was also given. 

4 Comparative Analysis of TA and IPP Traceability 

The results in this section justify a focus on TA (as opposed to IPP) schemes. In 
this paper we have been using the additional structure provided by linear codes 
to construct schemes for which the TA tracing algorithm is efficient. We know by 
Lemma 1 that c-TA codes are also c-IPP codes. However, the converse fails ([37];
see also Example 1 above). If constructions of schemes for which the IPP tracing
algorithm is efficient (i.e., significantly reduced from $O(\binom{N}{c})$ time) are possible,
it is reasonable to expect this to be accomplished by introducing an algebraic 
structure. Here we give evidence that doing so may enable the inherently more 
efficient TA algorithm to be used to identify traitors. Hence, it is unclear that 
c-IPP schemes yield any advantage over c-TA schemes in finding a traitor. 

First, we prove a necessary condition on Reed-Solomon codes, under which
they yield c-TA set systems. This condition is that the minimum distance is
greater than $r - r/c^2$, where $r$ is the length of the codewords. This result suggests
a potential method for generating examples of schemes that are c-IPP but not
c-TA, namely, decreasing the minimum distance. Next we demonstrate through
a family of counterexamples that in fact this approach does not work in general;
when the minimum distance is $r - r/c^2$ it is possible to find Reed-Solomon codes
for which both the IPP and TA tracing algorithms fail.

We recall that there is a natural way to produce unordered sets from the
ordered sets that constitute the code: to a codeword $x = (x_1, \ldots, x_r)$, associate
the set $x' = \{(1, x_1), \ldots, (r, x_r)\}$. We define TA and IPP set systems (as opposed
to TA and IPP codes) in the natural way, with the noteworthy difference that a
pirate unordered set consists of $r$ elements such that each element is a member
of some coalition member’s set. This is a generalization of our earlier definition
because it is not necessary to have one element of the form $(i, y_i)$ for each
$i = 1, \ldots, r$.

The following theorem is a partial converse of Theorem 1. 

Theorem 3. If $c \ge 2$ is an integer and $C$ is a Reed-Solomon code of length $r$
with minimum distance $d \le r - r/c^2$, then the set system corresponding to $C$ is
not a c-TA set system.

Proof. As above, if $u \in C$, write $u' = \{(1, u_1), \ldots, (r, u_r)\}$ for the associated
element of the set system. Choose a codeword $v = (v_1, \ldots, v_r)$ in $C$. We will
show that a coalition of size at most $c$ exists which does not contain $v'$, but
which can implicate $v'$. In other words, we will construct a pirate set $w$ which
can be created by a coalition $\{u_1', \ldots, u_b'\}$ with $b \le c$ that does not contain $v'$,
but which satisfies $|v' \cap w| \ge |u_i' \cap w|$ for every $i$. Let $\delta = r - d = k - 1$, where $k$
is the dimension of the code $C$. By assumption, $\delta \ge r/c^2$.

First, assume $c\delta \le r$. For $i = 1, \ldots, c$, choose $u_i \in C$, distinct from $v$, which
agrees with $v$ on the $\delta$ positions $(i-1)\delta + 1, \ldots, i\delta$. (To do this, simply find a
polynomial $h_i$ of degree $\delta$ which vanishes on the $\delta$ field elements corresponding
to these $\delta$ positions, and let $u_i$ be the codeword corresponding to the polynomial
$f - h_i$, where $f$ is the polynomial corresponding to $v$.) Notice that, since two
distinct codewords can agree on at most $\delta$ positions, each $u_i'$ contains at least
$r - c\delta$ elements which are not in $v'$ or in $u_j'$ for any $j \ne i$. Since $r - c\delta \ge 0$ and
$c \ge 2$, we have $r - c\delta \ge \lceil (r - c\delta)/c \rceil = \lceil r/c \rceil - \delta$. We can therefore form a pirate set $w$
so that for every $i$, $|u_i' \cap w| \le \delta + (\lceil r/c \rceil - \delta) = \lceil r/c \rceil$ and $|v' \cap w| = c\delta \ge \lceil r/c \rceil$. Thus
the TA algorithm will mark $v'$ as a traitor.

If on the other hand $c\delta > r$, simply choose $u_1, \ldots, u_j$ as above, where
$j = \lfloor r/\delta \rfloor < c$, and choose $u_{j+1} \ne v$ to agree with $v$ on the last $r - j\delta$ positions. The
coalition $\{u_1', \ldots, u_{j+1}'\}$ can create $v'$ as a pirate set. □
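
The first case of the proof can be traced on a small instance. The sketch below takes a Reed-Solomon code over $GF(11)$ with $r = 8$, $k = 4$ and $c = 2$ (so $d = 5 \le r - r/c^2 = 6$), lets $v$ be the zero codeword, and builds the pirate set described above. All parameters and helper names are assumptions for illustration, and positions are indexed from 0 rather than 1.

```python
from math import ceil

p, r, c, k = 11, 8, 2, 4        # hypothetical parameters: d = r - k + 1 = 5 <= r - r/c^2 = 6
delta = k - 1                   # delta = r - d
points = list(range(r))         # evaluation points in GF(11)

def as_set(word):
    """Unordered set {(i, x_i)} attached to a codeword (positions 0-indexed here)."""
    return {(i, x) for i, x in enumerate(word)}

def vanishing_word(roots):
    """Evaluations of f - h with f = 0 and h = prod (x - root), a polynomial of degree delta."""
    word = []
    for a in points:
        h = 1
        for root in roots:
            h = h * (a - root) % p
        word.append(-h % p)
    return tuple(word)

v = (0,) * r                    # innocent codeword (the zero polynomial)
coalition = [vanishing_word(points[i * delta:(i + 1) * delta]) for i in range(c)]

# Pirate set: the c*delta elements shared with v, plus ceil(r/c) - delta
# elements that each traitor shares with nobody else.
pirate = {(i, 0) for i in range(c * delta)}
extra = ceil(r / c) - delta
for u in coalition:
    others = as_set(v).union(*(as_set(x) for x in coalition if x != u))
    own = sorted(as_set(u) - others)
    pirate |= set(own[:extra])

score = {name: len(as_set(word) & pirate)
         for name, word in [("v", v), ("u1", coalition[0]), ("u2", coalition[1])]}
```

Here the innocent set $v'$ meets the pirate set in more elements than either traitor does, so the TA algorithm would accuse $v'$.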

The previous theorem leaves open the question of whether Reed-Solomon
codes with minimum distance at most $r - r/c^2$ might still have traceability when
the IPP algorithm is used even though the TA algorithm may no longer correctly
identify traitors. The following family of counterexamples illustrates that this is
not generally the case. It gives examples of Reed-Solomon codes of length $r$ and
minimum distance $r - r/c^2$ which are not c-IPP.

Theorem 4. Let $s$ and $c$ be positive integers with $c \ge 2$, and let $p$ be a prime
number greater than $c^2$. For $i = 1, \ldots, c$, let $a_i = (i-1)c$. For $i = 1, \ldots, c$, if $s$
is not divisible by $p$, let $g_i(x) = x^s - i$; otherwise let $g_i(x) = x^s + x - i$. Let $T$
be the set of roots of all the polynomials $g_i - a_j$. Let $q$ be a sufficiently high
power of $p$ so that $T$ is a subset of the finite field $F_q$. Then $T$ consists of $c^2 s$
distinct elements of $F_q$. Let $C$ be the Reed-Solomon code in which the codewords
are the evaluations at the elements of $T$ of all polynomials over $F_q$ of degree at
most $s$. Then the dimension of the code $C$ is $s + 1$, the length $r$ of the codewords
is $r = c^2 s$, the minimum distance of $C$ is $r - r/c^2$, and $C$ is not c-IPP.

Proof. We first show that $T$ consists of $c^2 s$ distinct elements. Let $h_{ij} = g_i - a_j$.
Then $h_{ij}(x) - h_{mn}(x) = -i - (j-1)c + m + (n-1)c$. If $h_{ij}(x) - h_{mn}(x) = 0$, then
$m - i$ is divisible by $c$. Since $m$ and $i$ are both in the range $1, \ldots, c$, they must
be equal. Thus $(j-1)c = (n-1)c$, and so $j = n$. Therefore the set $\{h_{ij}\}$ consists
of $c^2$ distinct polynomials of degree $s$, any two of which differ by a non-zero
constant. Therefore no two can have a root in common. Further, the derivative
of $h_{ij}$ is $s x^{s-1}$ if $s$ is not divisible by $p$, and is 1 otherwise. In both cases this
derivative is relatively prime to $h_{ij}$ (in the first case, note that $h_{ij}$ is always of
the form $x^s +$ (a non-zero constant), so it never has 0 as a root). Therefore all the
roots of $h_{ij}$ are simple. So $T$ consists of $c^2 s$ distinct elements, and it makes sense
to consider the Reed-Solomon code defined by evaluating polynomials of degree
at most $s$ at the elements of $T$. The code clearly has the stated parameters. The
two coalitions corresponding to the polynomials in the sets $\{a_1, \ldots, a_c\}$ and
$\{g_1, \ldots, g_c\}$ are disjoint, and each coalition can produce the pirate word defined
as follows: for each $\beta$ in $T$, the $\beta$-th entry of the pirate word is $g_i(\beta) = a_j$, for
the unique $i$ and $j$ such that the equality holds. It follows that the code is not
c-IPP. □
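
The smallest instance of this family can be written out directly. The sketch below takes $s = 1$, $c = 2$ and $p = 5$, so that $g_i(x) = x - i$, the roots of $g_i - a_j$ already lie in $F_p$, and $T = \{1, 2, 3, 4\}$. These parameter choices and the helper `in_desc` are assumptions for illustration; larger $s$ would require arithmetic in an extension field, which is not attempted here.

```python
c, p, s = 2, 5, 1     # hypothetical small case: p prime, p > c^2, s = 1
a = [(j - 1) * c for j in range(1, c + 1)]     # constants a_j = (j - 1)c

# With s = 1, g_i(x) = x - i, so the root of g_i - a_j is i + (j - 1)c.
T = sorted({(i + a_j) % p for i in range(1, c + 1) for a_j in a})

# Codewords of the two coalitions, evaluated at the elements of T.
const_words = [tuple(a_j % p for _ in T) for a_j in a]
g_words = [tuple((beta - i) % p for beta in T) for i in range(1, c + 1)]

def in_desc(word, coalition):
    """True if every entry of word agrees with some coalition member in that position."""
    return all(any(word[pos] == u[pos] for u in coalition)
               for pos in range(len(word)))

# The entry at beta is g_i(beta) = a_j for the unique pair (i, j) where equality holds.
pirate = tuple(
    next(a_j % p for a_j in a for i in range(1, c + 1)
         if (beta - i) % p == a_j % p)
    for beta in T)
```

Both coalitions can create the same pirate word although they share no member, which is exactly the failure of the c-IPP property.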



By evaluating the polynomials at subsets of $T$ of size at least $s + 1$ (to ensure
that $k \le r$), we can take the length $r$ to be anything between $s + 1$ and $c^2 s$. The
resulting minimum distance $r - s$ is then at most $r - r/c^2$.

We remark that if s is not divisible by p, then we can always find a q that 
works which is a divisor of p‘^ . 

The results in this section lead to the following questions which, while peripheral
to the traitor tracing problem, are of independent interest. Is it the case
that all Reed-Solomon codes of length $r$ with minimum distance $d \le r - r/c^2$
are not c-IPP? It is easy to see that this is false for linear codes in general.
For example, one-dimensional linear codes are always both c-IPP and c-TA, but
can have $d \le r - r/c^2$ if they are not Reed-Solomon codes (for one-dimensional
codes, the minimum distance $d$ is the number of non-zero entries in the non-zero
codewords; the codewords of distance less than $d$ from the pirate lie in every
coalition that can create the pirate). If the answer to the above question were
yes, combining it with Theorem 1 would imply that all Reed-Solomon c-IPP
codes are c-TA. We raise as an open question whether all linear c-IPP codes are
c-TA.

5 Finding All Possible Coalitions

In this section, we describe how a coding theoretic approach can be used to 
amass additional piracy information: a list of all coalitions that are capable of 
creating a given pirate. Such information is useful in two respects. It clears all 
codewords not appearing in any of these coalitions of involvement in constructing 
the pirate word, and it constitutes useful audit information that may be helpful 
in the prosecution of a traitor later on. The two algorithms of this section require 
only that the code have minimum distance greater than $r - r/c^2$ and therefore
are applicable to the codes in Theorem 2. The algorithms are fast when fast 
list decoding techniques exist. In addition, we note that for every code meeting 
this minimum distance requirement and having fast list decoding, the algorithms 
enable the IPP traitor tracing algorithm [23,2,37] to run more efficiently (as that 
algorithm works by intersecting all coalitions that are capable of creating a given 
pirate word). 

At a high level, the first algorithm builds a “tree” from which all c-coalitions 
capable of constructing a pirate w can be extracted. At the root of the tree 
lie all codewords that we know must be in every such coalition. The children 
are then candidate codewords for the next member of the coalition. Branches of 
the tree are extended until the current coalition “covers” w (i.e., is capable of 
constructing w), or until it becomes clear that this is impossible (e.g., because 
the coalition is already of size c and still cannot create w) . In the latter case that 
“dead-end” coalition is discarded and other branches of the tree are explored. 
Before describing the algorithm in more detail, we introduce some of the ideas
used. If $S$ is a subset of $\{1, \ldots, r\}$ and $s = |S|$, define a map $f_S : F_q^r \to F_q^{r-s}$
by “forgetting” the entries in positions corresponding to elements of $S$. If $C$ is a
code, then the image code $f_S(C)$ is the punctured code, where we view the code
$C$ as having been punctured at the positions corresponding to the elements of
$S$. If $u$ is in $f_S(C)$, any codeword $v$ such that $f_S(v) = u$ is called a lift of $u$ to $C$.

We say that $U$ is a minimal c-coalition for $w$ if $|U| \le c$, $w \in \mathrm{desc}(U)$, but $w$
is not in $\mathrm{desc}(V)$ for any proper subset $V$ of $U$. To obtain all coalitions of size
at most $c$ that can create $w$ from the minimal ones, append arbitrary elements
of the code.

Algorithm Sketch: 

Input: Integer $c > 1$, code $C$ of length $r$ and minimum distance greater than
$r - r/c^2$, pirate word $w \in \mathrm{desc}_c(C)$.

Output: A list of coalitions of size at most c that can create w, including all 
minimal c-coalitions for w. 

The basic steps of the algorithm are as follows: 

(i) Use list decoding to find all codewords $u_1, \ldots, u_a \in C$ ($a \le c$) within
distance $r - r/c$ of $w$. Let $S$ be the subset of $\{1, \ldots, r\}$ on which $w$ agrees
with at least one of $\{u_1, \ldots, u_a\}$, and let $s = |S|$. Let $r_1 = r - s$, $c_1 = c - a$,
$C_1 = f_S(C)$, and $w_1 = f_S(w)$. (Thus $C_1$ is the punctured code, $r_1$ is its
length, $w_1$ is the word which is the image of the pirate word under the
puncturing map, and $c_1$ is the number of coalition members still to be
found.) If $r_1 = 0$, quit and output $\{u_1, \ldots, u_a\}$. Set $i = 1$.

(ii) Use list decoding to find all codewords $v_{i1}, \ldots, v_{ib_i} \in C_i$ ($b_i \le c_i$) within
distance $r_i - r_i/c_i$ of $w_i$. (Note that the first time this is executed, the output
is non-empty.) If this outputs the empty set, exit to Step (iii). Otherwise, let
$S_i$ be the subset of $\{1, \ldots, r_i\}$ on which $w_i$ agrees with $v_{ib_i}$, and let $s_i = |S_i|$.
Let $r_{i+1} = r_i - s_i$, $c_{i+1} = c_i - 1$, $C_{i+1} = f_{S_i}(C_i)$, and $w_{i+1} = f_{S_i}(w_i)$.

(iii) To create the coalitions to output, always start with $u_1, \ldots, u_a$. Then add
(a lift to $C$ of) $v_{1b_1}$, $v_{2b_2}$, and so on. Continue until the list of codewords
“covers” the pirate $w$. When this process succeeds or dead-ends (i.e., the
current list does not yet cover $w$, but either we cannot find any codewords
within the required distance $r_i - r_i/c_i$ of $w_i$, or we already have $c$ codewords
in our list), then move up the “tree” of $v_{ij}$’s (i.e., move back through the
$v_{ij}$’s) to find the first unexplored branch and continue from there (repeating
Step (ii) with a different $v_{ij}$ in place of $v_{ib_i}$). The algorithm terminates when
all branches have been explored.

Analysis of the Algorithm: 

The algorithm is correct because the output is clearly a list of coalitions of size 
at most c that can create the pirate, and includes each minimal c-coalition at 
least once. (In fact, it may list a coalition more than once.) Note that in Step 
(iii), all lifts of each $v_{ij}$ should be considered. By Theorem 1, $u_1, \ldots, u_a$ are in
every coalition that can create $w$. In Step (ii), if $d_i > r_i - r_i/c_i^2$ where $d_i$ is
the minimum distance of the punctured code $C_i$, then every coalition that can
produce the original pirate $w$ will contain some lift to the original code of some
$v_{ij}$. Moreover, if a lift to $C$ of $v_{ij}$ is in some coalition that can create the original
pirate $w$, then there exists a codeword within $r_i - r_i/c_i$ of $v_{ij}$ (by the pigeonhole
principle), and the algorithm will proceed. If Step (ii) returns the empty set, then
the current path is a dead-end. Note that list decoding a punctured code and 
then lifting accomplishes the same thing as erasure-and-error decoding. When 
C satisfies any of the sets of conditions in Theorem 2, then Step (i) can be done 
efficiently (time polynomial in r). 

Note that the brute force method for finding all coalitions runs in time
$O(crN^c)$, where $N$ is the total number of codewords in the code (for each of
the at most $N^c$ coalitions of size at most $c$, compare each of the $r$ entries of the
pirate to the corresponding entry of each member of the coalition). For Reed-Solomon
codes with $r = O(c^2 k)$, this gives a runtime of $O(c^3 N^c \log N)$.

Our second algorithm is to list decode to find all codewords $u_1, \ldots, u_a$ ($1 \le
a \le c$) within distance $r - r/c$ of the pirate (as in Step (i) above), and then use
brute force to determine the remaining (at most) $c - a$ members of the coalitions.
When $C$ is a Reed-Solomon code satisfying the conditions in Theorem 2(i) with
$r = O(c^2 k)$, the dominant term in the runtime is $O(c^3 N^{c-a} \log N)$. This is clearly
an improvement over brute force alone, since $a \ge 1$.
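
A sketch of this second algorithm on a toy Reed-Solomon code follows. The exhaustive scan in the first step is again only a stand-in for a real list decoder, and the code parameters, the two traitors, and the function names are assumptions for illustration. The function relies on the code having minimum distance greater than $r - r/c^2$, so that the codewords found in the first step belong to every coalition.

```python
from itertools import combinations, product

def rs_code(p, points, k):
    """Reed-Solomon code over GF(p), p prime: evaluations of all polynomials of degree < k."""
    return [tuple(sum(c * pow(a, e, p) for e, c in enumerate(coeffs)) % p
                  for a in points)
            for coeffs in product(range(p), repeat=k)]

def covers(coalition, w):
    """True if w is in desc(coalition): every entry of w matches some member."""
    return all(any(u[i] == w[i] for u in coalition) for i in range(len(w)))

def coalitions_for(code, w, c):
    r = len(w)
    # First step: codewords within distance r - r/c of w (brute-force stand-in
    # for list decoding); by Theorem 1 these lie in every coalition.
    forced = [x for x in code if c * sum(xi == wi for xi, wi in zip(x, w)) >= r]
    rest = [x for x in code if x not in forced]
    found = []
    # Second step: brute force over the remaining (at most) c - a members.
    for extra in range(0, c - len(forced) + 1):
        for group in combinations(rest, extra):
            candidate = forced + list(group)
            if covers(candidate, w):
                found.append(frozenset(candidate))
    return forced, found

p, k, c = 7, 2, 2
points = list(range(6))                        # r = 6 > c^2 (k - 1) = 4
code = rs_code(p, points, k)
u1 = tuple((3 + 2 * a) % p for a in points)    # f(x) = 3 + 2x
u2 = tuple((4 + 2 * a) % p for a in points)    # f(x) = 4 + 2x
pirate = u1[:4] + u2[4:]                       # u1 is the dominant contributor
forced, found = coalitions_for(code, pirate, c)
```

In this instance only `u1` is close enough to the pirate to be found in the first step, and the brute-force step recovers `u2` as the unique codeword covering the remaining positions.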

6 Future Directions: Tracing with Extra Information

In this section, we describe how other coding theoretic techniques may be ap- 
plied to the traitor tracing problem when additional information about traitor 
behavior is available. 

One possible approach to tracing traitors is to try to second-guess their strat- 
egy. For example, if you believe that one traitor has contributed more than the 
other members of the coalition to the pirate, you can apply bounded-distance 
decoding up to the error-correction bound to find such traitors very quickly. This 
might involve a “ringleader” or “scapegoat” scenario. If on the other hand you 
believe that all traitors contributed roughly equal amounts, then list decoding 
should be tried first. Traitors can be searched for in sequences of expanding 
Hamming balls around the pirate. These searches can be run in parallel or se- 
quentially. The runtime of bounded-distance decoding up to the error-correction 
bound for Reed-Solomon codes is at most quadratic in the length of the code- 
words. Note that [32] gives a fast algorithm for list decoding Reed-Solomon codes 
beyond the error-correction bound (also quadratic in the codeword length), but 
does not go as far as the Guruswami-Sudan algorithm. It therefore will not be 
guaranteed to find a traitor, but would quickly find a ringleader. 

In [19], list decoding is considered not just in the case of errors, but also 
in the case of erasures and errors (and another potentially useful case that is 
referred to as “decoding with uncertain receptions”). For concatenated codes, 
[20] also deals with the problem of decoding from errors and erasures. Building on 
[19], [24] presents a high-performance soft-decision list decoding algorithm. We 
believe that these results also have potential for use in traitor tracing problems, 
in cases where some additional information is known about the traitors or how 
they are operating. 

If one has information about the traitors or their modes of operation, one 
can build that information into a reliability matrix, and apply soft-decision de- 
coding algorithms to trace. For example, suppose we know that a traitor who 
contributed the first entry to the pirate contributed at least r/c entries to the 
pirate. One can use this information to construct a skewed reliability matrix. If 
the underlying code is a Reed-Solomon code over a finite field of size q, one can 
then apply the soft-decision algorithm in [24] to find such a “dominant” traitor. 
The channel that models this situation is a $q$-ary symmetric channel. The first
column of the reliability matrix will have a 1 in the entry corresponding to the
field element that occurs in the first position of the pirate, and 0’s elsewhere.
For $j > 1$, the $j$th column of the reliability matrix will have $1 - \epsilon$ in the entry
corresponding to the field element in the $j$th entry of the pirate, and the other
entries will all be $\epsilon/(q-1)$, where $\epsilon < (q-1)/q$ is chosen so as to optimize the soft-decision
decoding algorithm in [24]. If one does not know which entry was contributed by
the traitor who contributed the most, one possible search method is to choose 
entries at random from the pirate and apply the above strategy to search for 
traitors that contributed that entry. 
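
A minimal sketch of such a skewed reliability matrix is given below. It assumes that field elements are labeled by the integers $0, \ldots, q-1$, that the off-peak entries of a column share the remaining probability equally (as for a $q$-ary symmetric channel), and that the value of `eps` is picked by hand; the function name and the sample pirate word are hypothetical, and the soft-decision decoder of [24] itself is not implemented.

```python
def reliability_matrix(pirate, q, eps):
    """q x r matrix; entry [a][j] is the assumed probability that symbol a was sent in position j."""
    r = len(pirate)
    matrix = [[0.0] * r for _ in range(q)]
    for j, symbol in enumerate(pirate):
        for a in range(q):
            if j == 0:
                # First column: the dominant traitor is assumed to have contributed this entry.
                matrix[a][j] = 1.0 if a == symbol else 0.0
            else:
                # Other columns: q-ary symmetric channel with crossover probability eps.
                matrix[a][j] = 1.0 - eps if a == symbol else eps / (q - 1)
    return matrix

m = reliability_matrix((3, 5, 0, 2, 5, 0), q=7, eps=0.3)
```

Each column is a probability distribution over the alphabet, so a soft-decision decoder can treat the matrix as channel output.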

Erasure-and-error decoding may be useful in fingerprinting or watermarking 
scenarios, such as those presented in [6,7,15]. In one model, a coalition creates 
a pirate copy of the digital content by leaving fixed all codeword entries where
they all agree, and choosing the values of the remaining positions from $Q \cup \{?\}$,
where $Q$ is the alphabet. The ?’s can be viewed as erasures.

7 Conclusion 

We have demonstrated that traitor tracing algorithms can be quite efficient 
when the construction of the traceability scheme is based on error-correcting 
codes and the method of tracing is based on fast list decoding algorithms. For 
the TA algorithm, traitors can be identified in time polynomial in $r$, where $r$
is roughly $c^2 \log_q N$, rather than in time $O(N)$. In addition, list decoding on
successive punctured codes gives a method for identifying all possible traitor 
coalitions of size at most c more efficiently than a brute force search. This is 
quite useful because of the additional piracy information it represents, as well 
as for the efficiency improvements that it enables for another traitor tracing 
algorithm that has garnered interest recently, the IPP algorithm. We also give 
evidence for a close relationship between the TA and IPP properties, for linear 
codes. Finally, we suggest avenues for future research, including explorations of 
applications of soft-decision and erasure decoding techniques to traitor tracing 
in scenarios where additional information has been obtained about the traitors 
or their mode of operation. 

Abstract. This paper describes truncated and impossible differential
cryptanalysis of the 128-bit block cipher Camellia, which was proposed
by NTT and Mitsubishi Electric Corporation. Our work improves on
the best known truncated and impossible differential cryptanalysis. As
a result, we show a nontrivial 9-round byte characteristic, which may
lead to a possible attack on a reduced-round version of Camellia without
input/output whitening, FL or $FL^{-1}$ in a chosen-plaintext scenario.
Previously, only 6-round differentials were known, which may suggest
a possible attack on Camellia reduced to 8 rounds. Moreover, we show
a nontrivial 7-round impossible differential, whereas only a 5-round
impossible differential was previously known. This cryptanalysis is effective
against general Feistel structures with round functions composed
of S-D (Substitution and Diffusion) transformation.

Keywords: Block Cipher Camellia, Truncated Differential Cryptanaly- 
sis, Impossible Differential Cryptanalysis 



1 Introduction 

Camellia is a 128-bit block cipher proposed by NTT and Mitsubishi Electric
Corporation [1]. It was designed to withstand all known cryptanalytic attacks
and to provide sufficient headroom to allow its use over the next 10-20 years.
Camellia supports a 128-bit block size and 128-, 192-, and 256-bit key lengths, i.e.,
the same interface specifications as the Advanced Encryption Standard (AES).
Camellia was proposed in response to the call for contributions from ISO/IEC
JTC 1/SC27 with the aim of it being adopted as an international standard.
Camellia was also submitted to NESSIE (New European Schemes for Signature,
Integrity, and Encryption). Furthermore, Camellia was submitted to CRYPTREC
(CRYPTography Research & Evaluation Committee) in Japan and it is
now being evaluated.

C. Boyd (Ed.): ASIACRYPT 2001, LNCS 2248, pp. 193-207, 2001. 

(c) Springer-Verlag Berlin Heidelberg 2001

Like E2 [5], which was submitted to AES, Camellia uses a combination
of a Feistel structure and the SPN (Substitution and Permutation Network)
structure, but it also includes new features such as the use of an improved linear
transformation in SPN-structures, the change of SPN-structures from three layers
into two, and the use of input/output whitening, FL and $FL^{-1}$. The result
is improved immunity against truncated differential cryptanalysis, which was
applied successfully against a reduced-round version of E2 by Matsui and Tokita
[12].

Truncated differential cryptanalysis was introduced by Knudsen [4], as a
generalization of differential cryptanalysis [3]. He defined them as differentials
where only a part of the differential can be predicted. The notion of truncated 
differentials as introduced by him is wide, but with a byte-oriented cipher such 
as E2 or Camellia, it is natural to study byte-wise differentials as truncated 
differentials. 

The initial analysis of the security of Camellia and its resistance to
truncated and impossible differential cryptanalysis is given in [1], [6]. They state
that Camellia with more than 11 rounds is secure against truncated differential
cryptanalysis, though they did not indicate the effective truncated differentials.
Up to now, the effective cryptanalysis applicable to Camellia has been the higher
order differential cryptanalysis proposed by Kawabata et al. [7], which utilizes
non-trivial 6-round higher order differentials, and the differential cryptanalysis
which utilizes a 7-round differential [2].

Our analysis improves on the best known truncated and impossible differential
cryptanalysis against Camellia. Our cryptanalysis finds a nontrivial 9-round truncated
differential, which may lead to a possible attack on Camellia reduced to 11 rounds
without input/output whitening, FL, or $FL^{-1}$ in a chosen-plaintext scenario.
Moreover, we show a nontrivial 7-round impossible differential, whereas only
5-round impossible differentials were previously known.

The contents of this paper are as follows. In Section 2, we describe the structures
of block ciphers, truncated differential probabilities, impossible differential
cryptanalysis and the block cipher Camellia. In Section 3, we describe the previous
work on the security of the block cipher Camellia. In Section 4, we cryptanalyze
Camellia by truncated differential cryptanalysis. In Section 5, we cryptanalyze
Camellia by impossible differential cryptanalysis. Section 6 concludes this paper.



2 Preliminaries 

In this section, we describe the general structures of block ciphers, truncated dif- 
ferential probabilities, impossible differential cryptanalysis and the block cipher 
Camellia. 



2.1 Feistel Structures 

Associate with a function $f : GF(2)^n \to GF(2)^n$ the function
$D_{2n,f}(L, R) = (R \oplus f(L), L)$ for all $L, R \in GF(2)^n$. $D_{2n,f}$ is
called the Feistel transformation associated with $f$. Furthermore, for functions
$f_1, f_2, \ldots, f_s : GF(2)^n \to GF(2)^n$, define
$\psi_n(f_1, f_2, \ldots, f_s) = D_{2n,f_s} \circ \cdots \circ D_{2n,f_2} \circ D_{2n,f_1}$.
We call $F(f_1, f_2, \ldots, f_s) = \psi_n(f_1, f_2, \ldots, f_s)$ the s-round
Feistel structure. At this time, we call the functions $f_1, f_2, \ldots, f_s$ the
round functions of the Feistel structure $F(f_1, f_2, \ldots, f_s)$.
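
The Feistel transformation defined above can be sketched in a few lines. This is only an illustration under stated assumptions: halves are modelled as Python integers, and the round functions used in the example are arbitrary toy functions (hypothetical, not Camellia's F-function). The names `feistel_round`, `feistel` and `feistel_inverse` are introduced here for illustration only.

```python
def feistel_round(f, left, right):
    """One Feistel transformation: (L, R) -> (R xor f(L), L)."""
    return right ^ f(left), left


def feistel(round_functions, left, right):
    """s-round Feistel structure: apply f_1 first and f_s last."""
    for f in round_functions:
        left, right = feistel_round(f, left, right)
    return left, right


def feistel_inverse(round_functions, left, right):
    """Undo the structure; works even if the round functions are not bijective."""
    for f in reversed(round_functions):
        # From (L', R') = (R xor f(L), L) we recover L = R' and R = L' xor f(R').
        left, right = right, left ^ f(right)
    return left, right


# Toy 8-bit round functions (assumptions for illustration only).
TOY_ROUNDS = [
    lambda x: (x * 5 + 1) & 0xFF,
    lambda x: x ^ 0x3C,
    lambda x: (x + 77) & 0xFF,
]
```

Note that the inverse never needs to invert a round function, which is why the structure is a permutation for any choice of $f_1, \ldots, f_s$.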



2.2 SPN-Structures [9] 

This structure consists of two kinds of layers: nonlinear layer and linear layer. 
Each layer has different features as follows. 

Nonlinear (Substitution) layer: This layer is composed of m parallel n-bit 
bijective nonlinear transformations. 

Linear (Diffusion) layer: This layer is composed of linear transformations
over the field $GF(2^n)$ (especially in the case of E2 and Camellia, $GF(2)$), where
inputs are transformed linearly to outputs per word ($n$ bits).

Next, for a positive integer s, we define the s-layer SPN-structure that consists
of s layers: the first is a nonlinear layer, the second is a linear layer, the third
is a nonlinear layer, and so on.



2.3 Word Characteristics 

We define a word characteristic function $\chi : GF(2^n)^m \to GF(2)^m$,
$(a_1, \ldots, a_m) \mapsto (b_1, \ldots, b_m)$ by

$b_i = \begin{cases} 0 & \text{if } a_i = 0 \\ 1 & \text{otherwise.} \end{cases}$

Hereafter, we call $\chi(a)$ the word characteristic of $a \in GF(2^n)^m$. Especially in
the case of $n = 8$, we call $\chi(a)$ the byte characteristic.



2.4 Truncated Differential Probability 

Definition 1. Let $\Delta x, \Delta y \in GF(2^n)^m$ denote the input and output differences
of the function $f$, respectively:

$\Delta x = (\Delta x_1, \Delta x_2, \ldots, \Delta x_m)$
$\Delta y = (\Delta y_1, \Delta y_2, \ldots, \Delta y_m)$

We define the input and output truncated differential $(\delta x, \delta y) \in (GF(2)^m)^2$ of
the function $f$, where

$\delta x = (\delta x_1, \delta x_2, \ldots, \delta x_m)$
$\delta y = (\delta y_1, \delta y_2, \ldots, \delta y_m)$

by $\delta x = \chi(\Delta x)$, $\delta y = \chi(\Delta y)$.

Let $p_f(\delta x, \delta y)$ denote the transition probability of the truncated differential
induced by the function $f$. $p_f(\delta x, \delta y)$ is defined on the truncated differential
$(\delta x, \delta y)$ by

$p_f(\delta x, \delta y) = \frac{1}{c} \sum_{\chi(\Delta x) = \delta x,\ \chi(\Delta y) = \delta y} \Pr_{x \in GF(2^n)^m}\left( f(x) \oplus f(x \oplus \Delta x) = \Delta y \right),$

where $c$ is the number of $\Delta x$ that satisfy $\chi(\Delta x) = \delta x$.
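
To make the definition concrete, the following sketch evaluates the word characteristic and the truncated differential probability by brute force. It is an illustration only: the word size (2 bits), the word count (2), the S-box and the function `toy_f` are all hypothetical toy choices, since an exhaustive evaluation for byte-oriented functions such as Camellia's would be far too large.

```python
from itertools import product

N_BITS, M_WORDS = 2, 2                       # toy sizes (assumption)
VECTORS = list(product(range(2 ** N_BITS), repeat=M_WORDS))
SBOX = (1, 3, 0, 2)                          # hypothetical bijective 2-bit S-box


def chi(vec):
    """Word characteristic: 0 for a zero word, 1 otherwise."""
    return tuple(int(word != 0) for word in vec)


def xor(a, b):
    return tuple(x ^ y for x, y in zip(a, b))


def toy_f(vec):
    """Toy function: S-box layer followed by a small linear mixing step."""
    s0, s1 = SBOX[vec[0]], SBOX[vec[1]]
    return (s0 ^ s1, s1)


def truncated_probability(f, dx_char, dy_char):
    """p_f(dx_char, dy_char): average over all input differences with
    characteristic dx_char of Pr_x[chi(f(x) xor f(x xor dx)) = dy_char]."""
    deltas = [d for d in VECTORS if chi(d) == dx_char]   # c = len(deltas)
    hits = 0
    for d in deltas:
        for x in VECTORS:
            if chi(xor(f(x), f(xor(x, d)))) == dy_char:
                hits += 1
    return hits / (len(deltas) * len(VECTORS))
```

Summing `truncated_probability(f, dx, dy)` over all output characteristics `dy` gives 1 for any fixed `dx`, which is a convenient sanity check on the definition.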

2.5 Block Cipher Camellia 

Fig. 1 shows the entire structure of Camellia. Fig. 2 shows its round function,
and Fig. 3 shows the FL-function and the FL^{-1}-function.



Fig. 1. Block cipher Camellia



3 Previous Security Evaluation of Camellia 

3.1 Security Evaluation against Truncated Differential 
Cryptanalysis [10] 

In [10], an algorithm to search for the effective truncated differentials of Feistel 
ciphers was proposed. This search algorithm consists of recursive procedures. 
Fig. 2. F function of Camellia

Fig. 3. FL and FL^{-1} functions of Camellia



Feistel ciphers are assumed to have S rounds, and the input and output block size is
2m bits.

Algorithm 1 [11]

Let $\Delta X^{(r)}, \Delta Y^{(r)} \in GF(2)^m$ be the input and output truncated differences of
the r-th round function $f_r$. $(\Delta L, \Delta R)$ is the truncated difference of the plaintext.
Let $\Pr((\Delta X^{(r)}, \Delta X^{(r-1)}) \mid (\Delta L, \Delta R))$ be the r-round truncated differential
probabilities.

1. Calculate all the truncated differential probabilities $p_f(\delta x, \delta y)$ of the round
function f for all truncated differentials $(\delta x, \delta y)$ and save these probabilities
in memory.

2. Select and fix $(\Delta L, \Delta R)$. $\Pr((\Delta X^{(1)}, \Delta X^{(0)}) \mid (\Delta L, \Delta R))$ should be initialized
as 1 if $(\Delta X^{(1)}, \Delta X^{(0)}) = (\Delta L, \Delta R)$, otherwise as 0.

3. Utilizing the values of $p_f(\delta x, \delta y)$, calculate
$\Pr((\Delta X^{(r+1)}, \Delta X^{(r)}) \mid (\Delta L, \Delta R))$ for all $(\Delta X^{(r+1)}, \Delta X^{(r)})$ from all values of
$\Pr((\Delta X^{(r)}, \Delta X^{(r-1)}) \mid (\Delta L, \Delta R))$, and save them in memory. Repeat this from
r = 1 to S, and save the most effective truncated differential probability in memory,
where 'most effective' means that the ratio of the obtained probability to the
average probability is the maximum.

4. Repeat 2-3 for every $(\Delta L, \Delta R)$.

5. Return the most effective truncated differential probability.
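
The propagation step of such a search (steps 1 to 3) can be sketched as follows. This is not the implementation of [10] or [11]; it is a minimal sketch under several assumptions: toy sizes (4 words of 2 bits), a hypothetical binary diffusion matrix `P` that is not Camellia's, an ideal model in which active S-box output differences are uniform over nonzero words, and an XOR model in which two active words cancel with probability $1/(2^n - 1)$. The selection of the "most effective" differential is left out.

```python
from itertools import product

N_BITS, M_WORDS = 2, 4            # toy sizes; Camellia would use 8 and 8
P = [[0, 1, 1, 1],                # hypothetical binary diffusion matrix
     [1, 0, 1, 1],
     [1, 1, 0, 1],
     [1, 1, 1, 0]]


def chi(vec):
    return tuple(int(word != 0) for word in vec)


def apply_p(vec):
    out = []
    for row in P:
        acc = 0
        for bit, word in zip(row, vec):
            if bit:
                acc ^= word
        out.append(acc)
    return tuple(out)


def round_table():
    """Step 1: p_f(dx, dy) for every truncated input difference dx."""
    table = {}
    for dx in product((0, 1), repeat=M_WORDS):
        choices = [range(1, 2 ** N_BITS) if active else (0,) for active in dx]
        counts, total = {}, 0
        for vec in product(*choices):
            dy = chi(apply_p(vec))
            counts[dy] = counts.get(dy, 0) + 1
            total += 1
        table[dx] = {dy: c / total for dy, c in counts.items()}
    return table


def xor_model(a, b):
    """Distribution of chi(A xor B) given chi(A) = a and chi(B) = b."""
    cancel = 1 / (2 ** N_BITS - 1)
    dist = {(): 1.0}
    for ai, bi in zip(a, b):
        options = [(0, cancel), (1, 1 - cancel)] if ai and bi else [(ai | bi, 1.0)]
        dist = {key + (w,): p * q for key, p in dist.items() for w, q in options}
    return dist


def search(d_left, d_right, rounds):
    """Steps 2-3: propagate Pr((dX(r+1), dX(r)) | (dL, dR)) round by round."""
    pf = round_table()
    state = {(d_left, d_right): 1.0}          # (current input, previous input)
    for _ in range(rounds):
        nxt = {}
        for (cur, prev), p in state.items():
            for dy, q in pf[cur].items():
                for new, r in xor_model(dy, prev).items():
                    nxt[(new, cur)] = nxt.get((new, cur), 0.0) + p * q * r
        state = nxt
    return state
```

Because every step only redistributes probability mass, the values returned by `search` sum to 1 for any number of rounds, which gives a simple consistency check.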

Using this procedure, we can search for all truncated differentials that lead to
possible attacks on reduced-round versions of Camellia. With this algorithm, we cannot
find any such truncated differentials for Camellia with more than 6 rounds.
The best 6-round truncated differential that leads to possible attacks on
reduced-round versions of Camellia is shown in Fig. 4.
Fig. 4. 6-round truncated differential of Camellia



By this path, total probability p ~ 2 whereas the average probability, 
which can be obtained when the entire round function is a random permutation, is
$2^{-96}$.

This evaluation is accurate if we take the ideal approximation model as is 
done in [10]. We note that this model is not always appropriate for Camellia, 
especially because the round function of Camellia is a 2-layer SPN, i.e. S-D 
(Substitution and Diffusion), not a 3-layer SPN, i.e. S-D-S. In [1] and [6], they
upper-bounded the truncated differential probabilities considering this gap, and
no effective truncated differentials for Camellia with more than 7 rounds (without
input/output whitening, FL or FL^{-1}) are known.
4 Truncated Differential Cryptanalysis of Reduced-Round
Version of Camellia without Input/Output Whitening,
FL or FL^{-1}

This section indicates the truncated differentials that are effective in the crypt- 
analysis of reduced version of Camellia. These truncated differences cannot be 
found by the algorithm described in the previous section. We define notation in 
Fig. 5. 




Fig. 5. Notation 



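First we analyze the round function of Camellia. The P-function composing the
F-function is denoted as follows.

$P : GF(2^8)^8 \to GF(2^8)^8$

$(z_1, z_2, z_3, z_4, z_5, z_6, z_7, z_8) \mapsto (z'_1, z'_2, z'_3, z'_4, z'_5, z'_6, z'_7, z'_8)$

This transformation can be expressed by the linear transformation represented
by the matrix P:

$(z'_1, z'_2, \ldots, z'_8)^T = P \, (z_1, z_2, \ldots, z_8)^T$

where

$P = \begin{pmatrix}
1 & 0 & 1 & 1 & 0 & 1 & 1 & 1 \\
1 & 1 & 0 & 1 & 1 & 0 & 1 & 1 \\
1 & 1 & 1 & 0 & 1 & 1 & 0 & 1 \\
0 & 1 & 1 & 1 & 1 & 1 & 1 & 0 \\
1 & 1 & 0 & 0 & 0 & 1 & 1 & 1 \\
0 & 1 & 1 & 0 & 1 & 0 & 1 & 1 \\
0 & 0 & 1 & 1 & 1 & 1 & 0 & 1 \\
1 & 0 & 0 & 1 & 1 & 1 & 1 & 0
\end{pmatrix}$

This transformation induces the transformation of the difference as follows.

$(\Delta z_1, \Delta z_2, \ldots, \Delta z_8)^T \mapsto (\Delta z'_1, \Delta z'_2, \ldots, \Delta z'_8)^T = P \, (\Delta z_1, \Delta z_2, \ldots, \Delta z_8)^T$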



Next we consider the truncated differentials effective for truncated differential 
cryptanalysis. 

When $\Delta z_1, \Delta z_2 \neq 0$ and $\Delta z_3 = \Delta z_4 = \Delta z_5 = \Delta z_6 = \Delta z_7 = \Delta z_8 = 0$, then

$(\Delta z_1, \Delta z_2, 0, 0, 0, 0, 0, 0)^T \mapsto (\Delta z_1,\ \Delta z_1 \oplus \Delta z_2,\ \Delta z_1 \oplus \Delta z_2,\ \Delta z_2,\ \Delta z_1 \oplus \Delta z_2,\ \Delta z_2,\ 0,\ \Delta z_1)^T$

When $\Delta z_1 \neq \Delta z_2$, this can be expressed in terms of byte characteristics as

$(11000000) \to (11111101)$.

In this case, this transition probability (truncated differential probability) is $p_1 \approx 1$.

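Utilizing the value of

$P^{-1} = \begin{pmatrix}
0 & 1 & 1 & 1 & 0 & 1 & 1 & 1 \\
1 & 0 & 1 & 1 & 1 & 0 & 1 & 1 \\
1 & 1 & 0 & 1 & 1 & 1 & 0 & 1 \\
1 & 1 & 1 & 0 & 1 & 1 & 1 & 0 \\
1 & 1 & 0 & 0 & 1 & 0 & 1 & 1 \\
0 & 1 & 1 & 0 & 1 & 1 & 0 & 1 \\
0 & 0 & 1 & 1 & 1 & 1 & 1 & 0 \\
1 & 0 & 0 & 1 & 0 & 1 & 1 & 1
\end{pmatrix},$

when $\Delta z_1, \Delta z_2 \neq 0$, $\Delta z_1 \neq \Delta z_2$, $\Delta z_3 = \Delta z_4 = \Delta z_5 = \Delta z_1 \oplus \Delta z_2$,
$\Delta z_6 = \Delta z_1$, $\Delta z_7 = 0$, $\Delta z_8 = \Delta z_2$, we obtain

$(\Delta z'_1, \Delta z'_2, \ldots, \Delta z'_8)^T = (\Delta z_2, \Delta z_1, 0, 0, 0, 0, 0, 0)^T,$

which can be expressed in terms of byte characteristics as

$(11111101) \to (11000000)$.

This transition probability (truncated differential probability) is $p_2 \approx 2^{-40}$.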
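
Both byte-characteristic transitions can be checked mechanically. The sketch below is illustrative only: the matrix `P` is the Camellia P-function as recalled from the public specification (treat it as an assumption), `P_INV` is the inverse printed in the text, and the probabilities $p_1$ and $p_2$ are obtained by simple counting under the assumption that the input byte differences of P are uniformly distributed over their nonzero values.

```python
from math import log2

# Camellia P-function as a binary matrix (row i gives z'_i); assumption:
# recalled from the public Camellia specification.
P = [
    [1, 0, 1, 1, 0, 1, 1, 1],
    [1, 1, 0, 1, 1, 0, 1, 1],
    [1, 1, 1, 0, 1, 1, 0, 1],
    [0, 1, 1, 1, 1, 1, 1, 0],
    [1, 1, 0, 0, 0, 1, 1, 1],
    [0, 1, 1, 0, 1, 0, 1, 1],
    [0, 0, 1, 1, 1, 1, 0, 1],
    [1, 0, 0, 1, 1, 1, 1, 0],
]
# P^{-1} as printed in the text.
P_INV = [
    [0, 1, 1, 1, 0, 1, 1, 1],
    [1, 0, 1, 1, 1, 0, 1, 1],
    [1, 1, 0, 1, 1, 1, 0, 1],
    [1, 1, 1, 0, 1, 1, 1, 0],
    [1, 1, 0, 0, 1, 0, 1, 1],
    [0, 1, 1, 0, 1, 1, 0, 1],
    [0, 0, 1, 1, 1, 1, 1, 0],
    [1, 0, 0, 1, 0, 1, 1, 1],
]


def mat_vec(matrix, vec):
    """Binary matrix times a vector of bytes: XOR of the selected bytes."""
    out = []
    for row in matrix:
        acc = 0
        for bit, byte in zip(row, vec):
            if bit:
                acc ^= byte
        out.append(acc)
    return out


def chi(vec):
    """Byte characteristic written as a bit string."""
    return "".join("1" if byte else "0" for byte in vec)


# (11000000) -> (11111101) fails only when the two active bytes are equal.
p1 = 254 / 255
# (11111101) -> (11000000): 255 * 254 favourable inputs out of 255 ** 7.
p2 = (255 * 254) / 255 ** 7
log2_p2 = log2(p2)   # close to -40
```

For example, `chi(mat_vec(P, [0x53, 0xCA, 0, 0, 0, 0, 0, 0]))` evaluates to the characteristic `11111101`.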

Utilizing these two transition probabilities, we can obtain a 9-round truncated 
differential that contains two different paths as in Fig. 6. 

In total, the first transition probability is approximately $2^{-112}$ (see the
evaluation in Appendix).

Similarly, we consider the other path, which is as effective as the first one.
In total, this transition probability is also $2^{-112}$ (see also the evaluation in
Appendix).

Summing the two probabilities, therefore, the truncated differential probability is

$\Pr(\chi(\Delta L') = (11000000), \chi(\Delta R') = (00000000) \mid \chi(\Delta L) = (00000000), \chi(\Delta R) = (11000000)) \approx 2.0 \times 2^{-112},$

which is approximately twice as large as the average value $2^{-112}$.

Our search has not found any truncated differential more effective than this 
for 9-round Camellia. 
Fig. 6. 9-round byte characteristic of Camellia



5 Impossible Differential Cryptanalysis of Reduced-Round
Version of Camellia without Input/Output
Whitening, FL or FL^{-1}

5.1 Impossible Differential Cryptanalysis 

An impossible differential is a differential that holds with probability 0, that is,
a differential that does not exist. Using such an impossible differential, it is
possible to narrow down the subkey candidates. It is known that there is at
least one 5-round impossible differential in any Feistel structure with bijective
round functions. Since Camellia uses the Feistel structure with FL and FL^{-1}
inserted every 6 rounds and the round function is bijective, Camellia
has 5-round impossible differentials.



5.2 Impossible Differential Cryptanalysis of Reduced Camellia 

In [1], they state that they have not found impossible differentials for more than
5 rounds. In this subsection, we indicate one impossible differential of a 7-round
reduced version of Camellia without input/output whitening, FL and
FL^{-1}, as shown in Fig. 7.
Fig. 7. 7-round impossible differential of Camellia



In this figure, we consider the byte characteristic

$(00000000\,10000000) \to (10000000\,00000000)$.

In this case, we can prove that this is an impossible differential as follows.
First we assume $\chi(\Delta L) = (00000000)$, $\chi(\Delta R) = (10000000)$, $\chi(\Delta R') = (00000000)$,
$\chi(\Delta L') = (10000000)$.

This assumption implies that

$\Delta x_1 = \Delta y_1 = P\Delta y_1 = 0$, $\Delta x_2 = \Delta R$, $\Delta x_7 = \Delta y_7 = P\Delta y_7 = 0$, $\Delta x_6 = \Delta L'$.

From

$P\Delta y_4 = P\Delta y_2 \oplus P\Delta y_6,$

it follows that

$\Delta y_4 = \Delta y_2 \oplus \Delta y_6,$

which implies

$\chi(\Delta x_4) = \begin{cases} (10000000) & \text{if } \Delta y_2 \neq \Delta y_6 \\ (00000000) & \text{otherwise.} \end{cases}$

From the definition of P,

$\chi(\Delta y_3) = \chi(\Delta x_3) = (11101001),$

where $\Delta y_3 \neq 0$ because $\Delta x_3 \neq 0$ follows from $\Delta y_2 \neq 0$.
Similarly,

$\chi(\Delta y_5) = \chi(\Delta x_5) = (11101001),$

where $\Delta y_5 \neq 0$ because $\Delta x_5 \neq 0$ follows from $\Delta y_6 \neq 0$.
Since $\chi(\Delta R) = (10000000)$, it holds that

$\chi(\Delta x_4 \oplus \Delta R) = \begin{cases} (10000000) & \text{if } \Delta x_4 \neq \Delta R \\ (00000000) & \text{otherwise.} \end{cases}$

Similarly, since $\chi(\Delta L') = (10000000)$, it holds that

$\chi(\Delta x_4 \oplus \Delta L') = \begin{cases} (10000000) & \text{if } \Delta x_4 \neq \Delta L' \\ (00000000) & \text{otherwise.} \end{cases}$

From Fig. 7, it holds that

$P\Delta y_3 = \Delta x_4 \oplus \Delta R, \quad P\Delta y_5 = \Delta x_4 \oplus \Delta L',$

however, there is no $(t, s) \in (GF(2^8)^8)^2$ such that $\chi(t) = (11101001)$, $\chi(s) = (10000000)$,
$t \neq 0$ and $Pt = s$.

Thus, the truncated differential represented by

$(00000000\,10000000) \to (10000000\,00000000)$

is impossible.
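
The final step of the argument, that no $t$ with $\chi(t) = (11101001)$ satisfies $\chi(Pt) = (10000000)$, can be checked directly. The sketch below is illustrative: `P` is the Camellia P-function as recalled from the public specification (an assumption), and `P_INV` is the inverse printed in Section 4. Every $s$ with characteristic (10000000) has a unique preimage $P^{-1}s$, so it suffices to look at the characteristics of these 255 preimages.

```python
# Camellia P-function (assumption: recalled from the public specification)
# and its inverse as printed in Section 4.
P = [
    [1, 0, 1, 1, 0, 1, 1, 1],
    [1, 1, 0, 1, 1, 0, 1, 1],
    [1, 1, 1, 0, 1, 1, 0, 1],
    [0, 1, 1, 1, 1, 1, 1, 0],
    [1, 1, 0, 0, 0, 1, 1, 1],
    [0, 1, 1, 0, 1, 0, 1, 1],
    [0, 0, 1, 1, 1, 1, 0, 1],
    [1, 0, 0, 1, 1, 1, 1, 0],
]
P_INV = [
    [0, 1, 1, 1, 0, 1, 1, 1],
    [1, 0, 1, 1, 1, 0, 1, 1],
    [1, 1, 0, 1, 1, 1, 0, 1],
    [1, 1, 1, 0, 1, 1, 1, 0],
    [1, 1, 0, 0, 1, 0, 1, 1],
    [0, 1, 1, 0, 1, 1, 0, 1],
    [0, 0, 1, 1, 1, 1, 1, 0],
    [1, 0, 0, 1, 0, 1, 1, 1],
]


def mat_vec(matrix, vec):
    """Binary matrix times a vector of bytes: XOR of the selected bytes."""
    out = []
    for row in matrix:
        acc = 0
        for bit, byte in zip(row, vec):
            if bit:
                acc ^= byte
        out.append(acc)
    return out


def chi(vec):
    return "".join("1" if byte else "0" for byte in vec)


# chi(P * dy) for a difference active only in the first byte: this is the
# characteristic of dx_3 (and of dx_5) in the proof.
forward = {chi(mat_vec(P, [a] + [0] * 7)) for a in range(1, 256)}

# Characteristics of all t with chi(P t) = (10000000).
preimages = {chi(mat_vec(P_INV, [a] + [0] * 7)) for a in range(1, 256)}

contradiction = "11101001" not in preimages
```

Under these assumptions `forward` contains only `11101001` and `preimages` contains only `01111001`, so the required pair $(t, s)$ cannot exist.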

6 Conclusion 

This paper evaluated the security of the block cipher Camellia against truncated
and impossible differential cryptanalysis. We introduced a nontrivial 9-round
truncated differential that leads to a possible attack on a reduced-round version
of Camellia without input/output whitening, FL or FL^{-1} in a chosen plaintext
scenario. Prior studies showed only a 6-round truncated differential for a
possible attack against 8-round Camellia. Moreover, we showed a nontrivial
7-round impossible differential, whereas only 5-round impossible differentials
were previously known.

Acknowledgment. We would like to thank Shiho Moriai and the anonymous 
reviewers for their helpful comments. 
Appendix: Evaluation of Truncated Differential Probability
of Camellia

First we evaluate the transition probability of the first path in Fig. 6. 



$\Pr(\chi(P\Delta y(1)) = (00000000) \mid \chi(\Delta x(1)) = (00000000)) = 1$
$\Pr(\chi(\Delta x(2)) = (11000000) \mid \chi(P\Delta y(1)) = (00000000), \chi(\Delta R) = (11000000)) = 1$
$\Pr(\chi(P\Delta y(2)) = (11111101) \mid \chi(\Delta x(2)) = (11000000)) \approx 1$
$\Pr(\chi(\Delta x(3)) = (11111101) \mid \chi(P\Delta y(2)) = (11111101), \chi(\Delta L) = (00000000)) = 1$
$\Pr(\chi(P\Delta y(3)) = (11000000) \mid \chi(\Delta x(3)) = (11111101)) \approx 2^{-40}$
$\Pr(\chi(\Delta x(4)) = (11000000) \mid \chi(\Delta x(2)) = \chi(P\Delta y(3)) = (11000000)) \approx 1$
$\Pr(\chi(P\Delta y(4)) = (11111101) \mid \chi(\Delta x(4)) = (11000000)) \approx 1$
$\Pr(\chi(\Delta x(5)) = (00000000) \mid \chi(\Delta x(4)) = \chi(\Delta x(2)) = (11000000), \chi(\Delta x(1)) = (00000000))$
$\quad = \Pr(\Delta y(4) = \Delta y(2) \mid \chi(\Delta x(4)) = \chi(\Delta x(2)) = (11000000)) \approx 2^{-16}$
$\Pr(\chi(P\Delta y(5)) = (00000000) \mid \chi(\Delta x(5)) = (00000000)) = 1$
$\Pr(\chi(\Delta x(6)) = (11000000) \mid \chi(P\Delta y(5)) = (00000000), \chi(\Delta x(4)) = (11000000)) = 1$
$\Pr(\chi(P\Delta y(6)) = (11111101) \mid \chi(\Delta x(6)) = (11000000)) \approx 1$
$\Pr(\chi(\Delta x(7)) = (11111101) \mid \chi(P\Delta y(6)) = (11111101), \chi(\Delta x(5)) = (00000000)) = 1$
$\Pr(\chi(P\Delta y(7)) = (11000000) \mid \chi(\Delta x(7)) = (11111101)) \approx 2^{-40}$
$\Pr(\chi(\Delta x(8)) = (11000000) \mid \chi(P\Delta y(7)) = (11000000), \chi(\Delta x(6)) = (11000000)) \approx 1$
$\Pr(\chi(P\Delta y(8)) = (11111101) \mid \chi(\Delta x(8)) = (11000000)) \approx 1$
$\Pr(\chi(\Delta x(9)) = (00000000) \mid \chi(\Delta x(8)) = \chi(\Delta x(6)) = (11000000), \chi(\Delta x(5)) = (00000000))$
$\quad = \Pr(\chi(P\Delta y(8) \oplus \Delta x(7)) = (00000000) \mid \chi(\Delta x(8)) = \chi(\Delta x(6)) = (11000000))$
$\quad = \Pr(\Delta y(8) = \Delta y(6) \mid \chi(\Delta x(8)) = \chi(\Delta x(6)) = (11000000)) \approx 2^{-16}$
$\Pr(\chi(P\Delta y(9)) = (00000000) \mid \chi(\Delta x(9)) = (00000000)) = 1$
$\Pr(\chi(\Delta x(10)) = (11000000) \mid \chi(P\Delta y(9)) = (00000000), \chi(\Delta x(8)) = (11000000)) = 1$

In total, the transition probability is approximately $2^{-112}$.
Similarly, we consider the other path in Fig. 6, which is as effective as the
first one.
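
As a quick arithmetic check of the first path, the per-step probabilities can be multiplied in the log domain and compared with the average probability. The sketch is illustrative only: the list `PATH1_LOG2` transcribes the exponents of the first path (steps marked "= 1" or "≈ 1" count as 0), the value for the second path is taken from the total stated in Section 4 rather than recomputed, and the average assumes that each of the 14 inactive output bytes is zero with probability $2^{-8}$.

```python
from math import log2

# log2 of each step probability along the first path.
PATH1_LOG2 = [0, 0, 0, 0, -40, 0, 0, -16, 0, 0, 0, 0, -40, 0, 0, -16, 0, 0]

path1 = sum(PATH1_LOG2)                   # exponent of the first path
path2 = -112                              # second path, as stated in Section 4

# Truncated output difference (dL', dR') = (11000000, 00000000).
output_characteristic = "11000000" + "00000000"
average = -8 * output_characteristic.count("0")

combined = log2(2.0 ** path1 + 2.0 ** path2)
advantage = combined - average            # log2 of the ratio to the average
```

With these numbers `advantage` is 1, i.e. the combined differential is about twice as likely as for a random permutation, which matches the factor 2.0 quoted in Section 4.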



PT{x{PAy{l)) = (00000000)|x(2\a;(l)) = (00000000)) = 1 
Pt{x{Ax{2)) = (11000000)|x(P2it/(l)) = (00000000), 

x{AR) = (11000000)) = 1 
Pr{xiPAy{2)) = (11111101)|x(2\a;(2)) = (11000000)) ~ 1 
Pr(x(2\a;(3)) = (11111101)|x(P2\j/(2)) = (11111101)), 

x{AR) = (00000000)) = 1 
Pr(x(P2ii/(3)) = (llllllll)|x(2\a;(3)) = (11111101)) ~ 1 
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Pr(x(Zia;(4)) = (llllllll)|x(Zia:(2)) = (11000000), 

x(PZ\y(3)) = (11111111)) ~ 1 
Pr(x(P/it/(4)) = (11111101)|x(4\a;(4)) = (11111111)) ~ 2" 
Pr(x(4\*(5)) = (11111101))|x(P/it/(4)) = x(^®(3)) = (11111101)) ~ 1 
Pr(x(P2ij/(5)) = (11111101)|x(4\a;(5)) = (11111111)) ~ 1 
Pr(x(4\a;(6)) = (llllllll)|x(P4\j/(5)) = (11111111), 

x(/ia;(4)) = (11111111)) ~ 1 

Pr(x(P2iy(6)) = (11111101)|x(4ia;(6)) = (11111111)) ~ 2" 
Pr(P"^Z\x(7) G{xG GF(2®)®|x(a:) = (11000000)}| 
x(Zij/(2)) = (11000000), x(Ziy(4)) = x(4ij/(6)) = (11111111), 
x(PZ\y(4)) = x(P4\y(6)) = (11111101)) 

= Pr(Z\y(4) © Ay(Q) G {x G GF(2®)®|x(a:) = (11000000)}| 
x(Zij/(2)) = (11000000), x(2iy(4)) = x(4ij/(6)) = (11111111), 

x(PZ\y(4)) = x(P4iy(6)) = (11111101)) ~ 2- 
Pr(x(P2ii/(7)) = (llllllll)|x(4\a;(7)) = (11111101)) ~ 1 
Pr(x(2ia;(8) = (11000000)|x(4\P) = (11000000), 
x(4\j/(3)) = x(4i?/(5)) = X(4iy(7)) = (11111101)) 

= Pr(Z\y(3) © Ay{5) © Ay(7) G P~^{x G GF(2®)®| 

X{x) = (11000000)}! 

x(4\j/(3)) = x( 4\2/(5)) = X(4iy(7)) = (11111101)) ~ 2" 
Pr(x(P2ij/(8)) = (11111101)|x(4ia;(8)) = (11000000)) ~ 1 
Pr(x(4\a;(9)) = (00000000)|x(4\i/(8)) = (11000000), 
P'^Ax{7) G{xG GF(2®)®|x(a:) = (11000000)}) ~ 2” 
Pr(x(P2ij/(9)) = (00000000)|x(4ia;(9)) = (00000000)) = 1 
Pr(x(/ia;(10)) = (11000000)|x(P2it/(9)) = (00000000), 

Xi^x{8) = ( 11000000 )) = 1 



In total, this transition probability is also approximately $2^{-112}$.
Abstract. With chosen-IV chosen texts, David Wagner has analyzed
the multiple modes of operation proposed by Eli Biham in FSE’98.
However, his method is too unrealistic. We use only known-IV chosen
texts to attack many triple modes of operation which are combined with
cascade operations. 123 triple modes are analyzed with complexities less
than E. Biham’s results. Our work shows that the security of many
triple modes decreases when the initial values are exposed.

Keywords: Block cipher, mode of operation for DES, Triple DES 



1 Introduction 

Since the appearance of DES [7], several attacks on DES and its variants have
been suggested. E. Biham and Adi Shamir introduced differential cryptanalysis
of DES in 1991 and 1992 [3,4]. Mitsuru Matsui analyzed DES with linear
cryptanalysis in 1993 and 1994 [5,6]. Differential cryptanalysis and linear
cryptanalysis are the most powerful methods for attacking DES. These attacks have
led many people in the cryptographic community to suggest stronger replacements
for DES, which can be either new cryptosystems or new modes of operation
for DES. So triple DES has been used instead of DES and applied to
the modes of operation for DES: ECB, CBC, OFB, and CFB. Triple DES is
more secure but slower than DES. This has led to consideration of
multiple modes of operation, which combine several consecutive applications of
single modes. In hardware implementations, the multiple modes have the advantage
that their speed is the same as that of single modes because the single modes
can be pipelined. In particular, the triple modes were expected to be as secure
as triple DES although they have DES as a building block.
In 1994 and 1996, E. Biham analyzed many triple modes of operation with
chosen plaintexts and chosen ciphertexts, and showed that every mode
considered except the triple ECB mode is not much more secure than single
modes [1,2]. Considering dictionary attacks or matching-ciphertext attacks,
the commonly-used triple-DES-ECB mode when used with some outer chaining
technique is not much more secure than any single mode. To solve this state of
affairs, E. Biham proposed 9 new block modes and 2 new stream modes of operation
for DES. The complexities of attacking these new modes are conjectured to
be at least $2^{112}$. The quadruple modes were conjectured to be more secure than
any triple mode; furthermore, the complexity of attacking two of the quadruple
modes was conjectured to be at least $2^{128}$.

In 1998, D. Wagner analyzed E. Biham’s proposals [8]. Using the chosen- 
IV chosen text queries he broke them with the complexities lower than what E. 
Biham has conjectured. His method utilizes an equation for an exhaustive search 
for a key or looks for a collision for a birthday attack. Since E. Biham’s studies 
were premised on a more restrictive threat model that did not admit chosen-IV 
attacks, D. Wagner’s results do not disprove E. Biham’s conjectures but raise 
questions about the security of E. Biham’s proposed modes. 

D. Wagner’s assumption of chosen IVs is too unrealistic, so we use known-IV
chosen texts, which are more practical than chosen-IV texts, to re-analyze the
triple modes which E. Biham analyzed. Our attacks take their place between
E. Biham’s and D. Wagner’s in terms of controlling IVs. However, since for fixed
IVs the birthday paradox is not available as it is in D. Wagner’s attacks, our
attacks cannot break E. Biham’s proposals. Our results show how much the security
of each triple mode decreases when the initial value is exposed.

2 Preliminaries 

In this section, we describe the notation and background needed to understand our
attacks. Note that the underlying block cipher of every mode throughout this paper
is DES with 64-bit plaintext blocks and a 56-bit key.

We write $P_0, P_1, \ldots$ (or $C_0, C_1, \ldots$, respectively) for the blocks of the
plaintext (or ciphertext, respectively). We number the keys $K_1, K_2, \ldots$ and the
initial values $IV_1, IV_2, \ldots$ according to the order in which the single modes
appear in the triple modes. The capital letters $A, B, \ldots$ are any fixed 64-bit values
if no additional explanations for them are given.

D. Wagner chose initial values and plaintexts or ciphertexts to analyze the 
multiple modes which E. Biham has proposed. His method searches for some 
equations or collisions to apply to an exhaustive search or birthday attack. Since 
E. Biham’s multiple modes are very secure, we think that it is very hard for 
anyone to find a proper method to break them. However the assumption that 
the attacker can choose the initial value is too unrealistic. The assumption of 
known-IV is more practical than that of chosen-IV because the initial values 
require integrity rather than secrecy. 

When all initial values are known, every double mode is broken by a
meet-in-the-middle attack. Furthermore, all except ECB|ECB, CBC|CBC, CBC|OFB,
CBC|CFB^{-1}, OFB|CBC^{-1}, OFB|OFB, OFB|CFB^{-1}, CFB|CBC^{-1}, CFB|OFB, and
CFB|CFB^{-1} are broken by only two exhaustive searches for two keys. To analyze
the triple modes, we search for equations which isolate one key or two keys. In the
former case, the isolated key is recovered with a $2^{56}$ exhaustive key search and
then the remaining two keys are found with a meet-in-the-middle attack. In most of
the latter cases, we can apply a meet-in-the-middle attack to the equation and then
find the remaining key with an exhaustive key search.

3 Known-IV Attacks 

We analyze 123 out of 216 triple modes. Complete knowledge of IVs is useful in 
breaking the triple modes which have the feedbacks driven into certain middle 
parts or arranged in a particular direction. However, it hardly helps the attacker 
who tries to find the keys of the triple modes with the feedbacks to spread 
forward and backward. 

A meet-in-the-middle attack on the double ECB mode requires two known plaintexts.
Using the first plaintext, we search for key candidates such that the intermediate
values are equal. A wrong key pair may still make the intermediate values equal,
with probability $2^{-64}$. We can find the right keys with high
probability by checking the candidates against the other plaintext.
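
The meet-in-the-middle procedure on double ECB can be sketched as follows. This is only an illustration under loud assumptions: DES is replaced by a hypothetical toy cipher with 16-bit blocks and 8-bit keys (`enc`, `dec`), chosen solely so that the search finishes instantly; it has no cryptographic value. The structure of the attack (table on one side, lookup on the other, second pair as a filter) is what the sketch is meant to show.

```python
MASK = 0xFFFF
MUL = 0x6487                                  # odd, hence invertible mod 2**16
MUL_INV = pow(MUL, -1, 1 << 16)               # needs Python 3.8 or later


def rotl(x, r):
    return ((x << r) | (x >> (16 - r))) & MASK


def rotr(x, r):
    return ((x >> r) | (x << (16 - r))) & MASK


def enc(key, block):
    """Hypothetical toy block cipher: 16-bit block, 8-bit key."""
    for r in range(3):
        block ^= (key * 0x0101 + 0x1234 * r) & MASK
        block = (block * MUL) & MASK
        block = rotl(block, 3)
    return block


def dec(key, block):
    for r in reversed(range(3)):
        block = rotr(block, 3)
        block = (block * MUL_INV) & MASK
        block ^= (key * 0x0101 + 0x1234 * r) & MASK
    return block


def meet_in_the_middle(pairs):
    """Recover (k1, k2) with c = enc(k2, enc(k1, p)) from two known pairs."""
    (p1, c1), (p2, c2) = pairs
    table = {}
    for k1 in range(256):                     # forward half, stored in a table
        table.setdefault(enc(k1, p1), []).append(k1)
    candidates = []
    for k2 in range(256):                     # backward half, looked up
        for k1 in table.get(dec(k2, c1), []):
            if enc(k2, enc(k1, p2)) == c2:    # filter with the second pair
                candidates.append((k1, k2))
    return candidates
```

The cost is about twice the size of one key space plus the table, instead of the product of the two key spaces.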

The meet-in-the-middle attacks in our work also require two equations, two 
chosen plaintexts, or two chosen ciphertexts. Taking this into account, we choose 
the plaintexts or the ciphertexts; we classify the attacks according to the choice 
of the texts. 

3.1 AAB-Attack 

This method can break all triple modes in which the first two modes are
ECB|ECB. We describe the attack on ECB|ECB|CBC^{-1} as an example. We
choose the plaintexts (A, A, B) and obtain the ciphertexts ($C_0, C_1, C_2$). In Fig.
1, the intermediate values after the first ECB component and the second ECB
component are (A', A', B') and (A'', A'', B''). Then the output of the third
encryption box in the first block is equal to that in the second block. Therefore,
we obtain the following equation.

$E_{K_3}(IV_3 \oplus C_0) = IV_3 \oplus C_0 \oplus C_1$

So we may find $K_3$ by a $2^{56}$ exhaustive search, recognizing the right key value
when the above equation holds. We expect no wrong key to survive the check
with a high probability.

Finally, $K_1$ and $K_2$ can be recovered by the meet-in-the-middle attack with
two plaintexts A and B. Consequently, we use 3 chosen plaintexts to break
the ECB|ECB|CBC“^ mode, whereas E. Biham’s method requires 2®"* chosen 
plaintexts. 
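
The key-isolating equation of the AAB-attack can be exercised on a toy model. The sketch below makes several assumptions that are not in the original: DES is replaced by a hypothetical toy cipher with 16-bit blocks and 8-bit keys, and CBC^{-1} is modelled as CBC decryption used in the encryption direction (output block = $E^{-1}_{K_3}$(input) XOR previous input, starting from $IV_3$), which is the reading consistent with the equation above. Only the recovery of $K_3$ is shown.

```python
MASK = 0xFFFF
MUL = 0x6487                                  # odd, hence invertible mod 2**16
MUL_INV = pow(MUL, -1, 1 << 16)               # needs Python 3.8 or later


def rotl(x, r):
    return ((x << r) | (x >> (16 - r))) & MASK


def rotr(x, r):
    return ((x >> r) | (x << (16 - r))) & MASK


def enc(key, block):
    """Hypothetical toy block cipher: 16-bit block, 8-bit key."""
    for r in range(3):
        block ^= (key * 0x0101 + 0x1234 * r) & MASK
        block = (block * MUL) & MASK
        block = rotl(block, 3)
    return block


def dec(key, block):
    for r in reversed(range(3)):
        block = rotr(block, 3)
        block = (block * MUL_INV) & MASK
        block ^= (key * 0x0101 + 0x1234 * r) & MASK
    return block


def ecb_ecb_cbcinv_encrypt(keys, iv3, blocks):
    """Toy ECB|ECB|CBC^{-1} under the assumed definition of CBC^{-1}."""
    k1, k2, k3 = keys
    out, prev = [], iv3
    for p in blocks:
        y = enc(k2, enc(k1, p))               # value after the two ECB layers
        out.append(dec(k3, y) ^ prev)
        prev = y
    return out


def recover_k3(iv3, c0, c1):
    """All keys k with E_k(IV3 xor C0) = IV3 xor C0 xor C1."""
    return [k for k in range(256) if enc(k, iv3 ^ c0) == iv3 ^ c0 ^ c1]
```

In this model, choosing the plaintexts (A, A, B) makes the first two intermediate values equal, which is exactly what lets the equation isolate $K_3$.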

Fig. 1. Attack of ECB|ECB|CBC^{-1}

3.2 AABB-Attack

This method can break many triple modes in which the last two modes are
CBC|ECB. We describe the attack on OFB|CBC|ECB as an example. We
choose the ciphertexts (A, A, B, B) and obtain the corresponding plaintexts
($P_0, P_1, P_2, P_3$). In Fig. 2, the intermediate values entering the second encryption
boxes are (A'', A'', B'', B''), where $A' = E^{-1}_{K_3}(A)$, $B' = E^{-1}_{K_3}(B)$, $A'' = E^{-1}_{K_2}(A')$,
and $B'' = E^{-1}_{K_2}(B')$. Therefore, we obtain the following equation for the first two
blocks.

$E_{K_1}(IV_1) \oplus E_{K_1}(E_{K_1}(IV_1)) = IV_2 \oplus P_0 \oplus P_1 \oplus E^{-1}_{K_3}(A)$

$K_1$ and $K_3$ are found by a meet-in-the-middle attack. The right side of the
above equation is computed for each possible key value of $K_3$ and the result
is kept in a table. Then the left side is computed under each possible key
value of $K_1$ and we check whether the result appears in the table. If a pair of
keys $(K_1, K_3)$ satisfies both the above and the following equations, we conclude
that they are the right keys for $K_1$ and $K_3$.

$E_{K_1}(E_{K_1}(E_{K_1}(IV_1))) \oplus E_{K_1}(E_{K_1}(E_{K_1}(E_{K_1}(IV_1)))) = P_2 \oplus P_3 \oplus E^{-1}_{K_3}(A) \oplus E^{-1}_{K_3}(B)$

K2 is recovered by brute force. Consequently, we use 4 chosen ciphertexts to 
break the OFB|CBC|ECB mode, whereas E. Biham’s method requires 2 ®^ chosen 
ciphertexts. 



3.3 AAAB-Attack

If the last two modes are a combination of ECB, CBC, or CFB, the triple mode is
vulnerable to this attack. We will describe the application of this method to the
CBC|CFB|ECB mode. To find the keys, we choose the ciphertexts (A, A, A, B), and












D. Hong et al. 




Fig. 2. Attack of OFB|CBC|ECB 



obtain the corresponding plaintexts ($P_0, P_1, P_2, P_3$). In Fig. 3, the intermediate
values after the first CBC component must be of the form (?, F, F, ?). Therefore,
the intermediate value entering the first encryption box in the second block is
equal to that in the third block. For all the possible values of $K_1$, we check the
following equation.

$E_{K_1}(IV_1 \oplus P_0) \oplus E_{K_1}(E_{K_1}(IV_1 \oplus P_0) \oplus P_1) = P_1 \oplus P_2$

Consequently, we use 4 chosen ciphertexts to break the CBC|CFB|ECB mode, whereas
E. Biham’s method requires 2 ^® chosen ciphertexts. 

3.4 AAA-Attack

This attack can break all triple modes in which the first two modes are ECB|OFB.
We explain the attack on the ECB|OFB|OFB mode as an example.
We choose the plaintexts (A, A, A) and obtain the corresponding ciphertexts
($C_0, C_1, C_2$). The following equations are obtained from the fact that all of the
intermediate values after the first ECB mode are equal.

$E_{K_2}(IV_2) \oplus E_{K_2}(E_{K_2}(IV_2)) = C_0 \oplus C_1 \oplus E_{K_3}(IV_3) \oplus E_{K_3}(E_{K_3}(IV_3))$

$E_{K_2}(IV_2) \oplus E_{K_2}(E_{K_2}(E_{K_2}(IV_2))) = C_0 \oplus C_2 \oplus E_{K_3}(IV_3) \oplus E_{K_3}(E_{K_3}(E_{K_3}(IV_3)))$

Then we can find $K_2$ and $K_3$ by a meet-in-the-middle attack. Consequently, we
use 3 chosen plaintexts to break the ECB|OFB|OFB mode, whereas E. Biham’s 
method requires 2 ®® chosen plaintexts. 
3.5 IVIVA-Attack

The main targets of this attack are the triple modes in which the last mode
is CFB. We explain the attack on the OFB|CFB|CFB mode as an example.
We choose the ciphertexts ($IV_3, IV_3, A$) and obtain the corresponding plaintexts
($P_0, P_1, P_2$). In Fig. 5, the intermediate values after the second CFB component
must be of the form (B, B, ?), where $B = IV_3 \oplus E_{K_3}(IV_3)$. Then, the intermediate
value of the input to the second encryption box in the second block is equal to
that in the third block. We use the following equation to find $K_1$ by brute force.

$E_{K_1}(E_{K_1}(IV_1)) \oplus E_{K_1}(E_{K_1}(E_{K_1}(IV_1))) = P_1 \oplus P_2 \oplus IV_3 \oplus A$

Consequently, we use 3 chosen ciphertexts to break the OFB|CFB|CFB mode, 
whereas E. Biham’s method requires 2®® chosen cipehrtexts. 



Fig. 5. Attack of OFB|CFB|CFB



3.6 IVIVIV-Attack

This method analyzes many triple modes in which the first mode is any single
mode with a feedback and the second mode is the OFB mode. We explain
the attack on the CFB|OFB|CBC mode as an example. We choose the ciphertexts
($IV_3, IV_3, IV_3$) and obtain the corresponding plaintexts ($P_0, P_1, P_2$). In Fig. 6,
all of the intermediate values after the second OFB component are equal. We obtain
the following equation.

$E_{K_1}(IV_1) \oplus P_0 \oplus E_{K_2}(IV_2) = E_{K_1}(E_{K_1}(IV_1) \oplus P_0) \oplus P_1 \oplus E_{K_2}(E_{K_2}(IV_2))$

$\quad = E_{K_1}(E_{K_1}(E_{K_1}(IV_1) \oplus P_0) \oplus P_1) \oplus P_2 \oplus E_{K_2}(E_{K_2}(E_{K_2}(IV_2)))$

Then we can find $K_1$ and $K_2$ by a meet-in-the-middle attack. Consequently,
we use 3 chosen ciphertexts to break the CFB|OFB|CBC mode, whereas E. 
Biham’s method requires 2®® chosen ciphertexts. Furthermore, it takes 5 • 2®® 
encryption times with our attack until its three keys are found, whereas it takes 
2®® encryption times with E. Biham’s attack. 




Fig. 6. Attack of CFB|OFB|CBC 



3.7 IVIVIVA-Attack

The triple CBC mode and similar modes are broken by this method. We describe
the attack on the triple CBC mode as an example. We choose the ciphertexts
($IV_3, IV_3, IV_3, A$) and obtain the corresponding plaintexts ($P_0, P_1, P_2, P_3$). In
Fig. 7, the intermediate values after the first CBC component must be of the
form $(?, B \oplus B', B \oplus B', ?)$, where $B = IV_3 \oplus E^{-1}_{K_3}(IV_3)$ and $B' = E^{-1}_{K_2}(B)$.
Therefore, the second and the third blocks in the values of the input to the first
encryption boxes are equal.

$E_{K_1}(IV_1 \oplus P_0) \oplus E_{K_1}(E_{K_1}(IV_1 \oplus P_0) \oplus P_1) = P_1 \oplus P_2$

We may find $K_1$ by a $2^{56}$ exhaustive key search, recognizing the right key value
when the above equation holds. Consequently, we use 4 chosen ciphertexts to
break the CBC | CBC | CBC mode, whereas E. Biham’s method requires 2®^ chosen 
ciphertexts. 
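
The IVIVIVA-attack on triple CBC can be replayed on a toy model. As before, this is a sketch under explicit assumptions: DES is replaced by a hypothetical toy cipher with 16-bit blocks and 8-bit keys, the attacker is given a decryption oracle (`triple_cbc_decrypt`, a name introduced here) and knows all three IVs. Only the recovery of $K_1$ from the equation above is shown.

```python
MASK = 0xFFFF
MUL = 0x6487                                  # odd, hence invertible mod 2**16
MUL_INV = pow(MUL, -1, 1 << 16)               # needs Python 3.8 or later


def rotl(x, r):
    return ((x << r) | (x >> (16 - r))) & MASK


def rotr(x, r):
    return ((x >> r) | (x << (16 - r))) & MASK


def enc(key, block):
    """Hypothetical toy block cipher: 16-bit block, 8-bit key."""
    for r in range(3):
        block ^= (key * 0x0101 + 0x1234 * r) & MASK
        block = (block * MUL) & MASK
        block = rotl(block, 3)
    return block


def dec(key, block):
    for r in reversed(range(3)):
        block = rotr(block, 3)
        block = (block * MUL_INV) & MASK
        block ^= (key * 0x0101 + 0x1234 * r) & MASK
    return block


def triple_cbc_decrypt(keys, ivs, blocks):
    """Peel off the three CBC layers, outermost (K3, IV3) first."""
    data = list(blocks)
    for key, iv in zip(reversed(keys), reversed(ivs)):
        prev, out = iv, []
        for c in data:
            out.append(dec(key, c) ^ prev)
            prev = c
        data = out
    return data


def recover_k1(iv1, p):
    """All keys k satisfying the equation of Section 3.7."""
    found = []
    for k in range(256):
        x0 = enc(k, iv1 ^ p[0])
        if x0 ^ enc(k, x0 ^ p[1]) == p[1] ^ p[2]:
            found.append(k)
    return found
```

Here the chosen ciphertext ($IV_3, IV_3, IV_3, A$) forces the second and third outputs of the first CBC layer to coincide, so the check involves $K_1$ only.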
Fig. 7. Attack of CBC|CBC|CBC



3.8 IVIVAAA-Attack

The ECB|CBC|CFB mode and the CFB|CBC|CFB mode are broken by this
method. To find the keys of the ECB|CBC|CFB mode, we choose the ciphertexts
($IV_3, IV_3, A, A, A$) and obtain the corresponding plaintexts ($P_0, P_1, P_2, P_3, P_4$).
In Fig. 8, the intermediate values entering the second encryption boxes must
be of the form (B, B, ?, F, F), where $B = E^{-1}_{K_2}(E_{K_3}(IV_3) \oplus IV_3)$ and
$F = E^{-1}_{K_2}(E_{K_3}(A) \oplus A)$. Therefore, for the first and the second blocks, we obtain
the following equation.

$E_{K_1}(P_0) \oplus E_{K_1}(P_1) = IV_2 \oplus IV_3 \oplus E_{K_3}(IV_3)$

By a meet-in-the-middle attack, we find a few candidates for the pair $(K_1, K_3)$
from the above equation. If a candidate satisfies the following equation, which
we obtain for the fourth and fifth blocks, we are sure that it is the right value
of $(K_1, K_3)$.

$E_{K_1}(P_3) \oplus E_{K_1}(P_4) = E_{K_3}(IV_3) \oplus E_{K_3}(A)$

Then K2 is recovered by an exhaustive search. Consequently, we use 5 chosen 
ciphertexts to break the ECB|CBC|CFB mode, whereas E. Biham’s method 
requires 2^^ chosen ciphertexts. 

Fig. 8. Attack of ECB|CBC|CFB

3.9 IVIVAAAB-Attack

We only apply this method to the ECB|CFB|CBC mode. To find its keys,
we choose the ciphertexts ($IV_3, IV_3, A, A, A, B$) and obtain the corresponding
plaintexts ($P_0, P_1, P_2, P_3, P_4, P_5$). In Fig. 9, the intermediate values after the
second CFB component must be of the form (F, F, ?, G, G, ?), where
$F = IV_3 \oplus E^{-1}_{K_3}(IV_3)$ and $G = A \oplus E^{-1}_{K_3}(A)$. Therefore, for the second and
third blocks, we obtain the following equation.

$E_{K_1}(P_1) \oplus E_{K_1}(P_2) = E^{-1}_{K_3}(IV_3) \oplus E^{-1}_{K_3}(A)$

By a meet-in-the-middle attack, we find a few candidates for the pair $(K_1, K_3)$
from the above equation. If a candidate satisfies the following equation, which
we obtain for the fifth and sixth blocks, we are sure that it is the right value
of $(K_1, K_3)$.

$E_{K_1}(P_4) \oplus E_{K_1}(P_5) = E^{-1}_{K_3}(A) \oplus E^{-1}_{K_3}(B)$

Then K 2 is recovered by an exhaustive search. Consequently, we use 6 chosen 
ciphertexts to break the ECB|CFB|CBC mode, whereas E. Biham’s method 
requires 2^^ chosen ciphertexts. 

4 Conclusion 

In this paper, we have presented attacks that break many triple modes of
operation with known-IV chosen plaintexts or chosen ciphertexts. Our attacks
require fewer texts than E. Biham's cryptanalysis of triple modes. We have
analyzed 123 of the 216 triple modes of operation. If the initial values are known,
the triple modes whose feedbacks are driven into certain middle parts or
arranged in one direction may be much weaker than under E. Biham's assumption.
They are broken with about 3-4 chosen plaintexts or ciphertexts, $2^{56}$ encryptions,
and $2^{56}$ memory. However, we could not find a proper method to attack the
other triple modes, whose feedbacks spread both forward and backward. We leave
the problem of such triple modes open.

Fig. 9. Attack of ECB|CFB|CBC



References 

1. E. Biham. Cryptanalysis of multiple modes of operation. Journal of Cryptology,
11:45-58, 1998.

2. E. Biham. Cryptanalysis of triple modes of operation. Journal of Cryptology,
12:161-184, 1999.

3. Eli Biham and Adi Shamir. Differential cryptanalysis of DES-like cryptosystems.
Journal of Cryptology, 4(1):3-72, 1991.

4. Eli Biham and Adi Shamir. Differential cryptanalysis of the full 16-round DES. In
E. F. Brickell, editor, Advances in Cryptology - CRYPTO'92, volume 740 of Lecture
Notes in Computer Science, pages 487-496. Springer-Verlag, 1993.

5. Mitsuru Matsui. Linear cryptanalysis method for DES cipher. In T. Helleseth,
editor, Advances in Cryptology - EUROCRYPT'93, volume 765 of Lecture Notes in
Computer Science, pages 386-397. Springer-Verlag, 1994.

6. Mitsuru Matsui. On correlation between the order of S-boxes and the strength of
DES. In Advances in Cryptology - EUROCRYPT'94, volume 950 of Lecture Notes
in Computer Science, pages 366-375. Springer-Verlag, 1995.

7. National Bureau of Standards. Data Encryption Standard. FIPS Pub. 46, 1977.

8. D. Wagner. Cryptanalysis of some recently-proposed multiple modes of operation.
In FSE'98, volume 1372 of Lecture Notes in Computer Science, 1998.



Appendix 

In this appendix we list our results. We follow Biham's notation for the complexity,
which consists of three parameters: the number of plaintexts / the number of steps
of the attack (the time in encryptions) / the required memory size. 'Biham' is the
result in [2] corresponding to ours. We compute some of E. Biham's complexities
in detail where the differences between our results and his are relatively small.



Table 1. AAB-attack 



Mode 


Complexity 


Biham 


Inverse 


ECB|ECB|CBC 


3/4 • 


2®®/2®“/2®® 


CBC"^|ECB|ECB 


ECB ECB CBC"^ 


3/3 • 2®®/2®® 


264/258/2®6 


CBC|ECB|ECB 


ECB ECB OFB 


3/4 • 2®®/2®® 


264/258/2®6 


OFB ECB ECB 


ECB ECB CEB 


3/4 • 2®®/2®® 


233/258/2®6 


CFB-^^IECBIECB 


ECB ECB CFB-^^ 


3/4 • 2®®/2®® 


264/258/2®6 


CFB|ECB|ECB 



Table 2. AABB-attack 


Mode 


Complexity 


Biham 


Inverse 


ECB|CBC|ECB 


4/4 • 2®®/2®® 


5/2®7- 


ECB|CBC"^|ECB 


CBC-^^ICBCIECB 


4/4- 2®®/2®® 


5/2®V- 


ECB CBC"^ CBC 


CBC|CBC|ECB 


4/4- 2®®/2®® 


234/2®9/233 


ECB CBC“^ CBC"^^ 


OFB CBC ECB 


4/4- 2®®/2®® 


264/5 , 2S6/_ 


ECB CBC"^ OFB 


CFB-i|CBC|ECB 


4/4- 2®®/2®® 


5/2®V- 


ECB CBC"^ CEB 


CFB|CBC|ECB 


4/4- 2®®/2®® 


236/2®9/233 


ECB CBC"^ CFB-^^ 



Table 3. AAAB-attack 


Mode 


Complexity 


Biham 


Inverse 


ECB|CBC|CBC 

CBC”^|CBC|ECB 


4/4 


260/266 


2®®/2®”/2®® 


CBC""|CBC“^|ECB 


4/4 


2®6/256 


5/2®V- 


ECB|CBC"^|CBC 


CBC|CBC|ECB 


4/4 


2®6/256 


234/2®9/233 


ECB CBC"^ CBC"^^ 


CFB-^^ICBCIECB 
CBC"^|CFB ECB 


4/4 


2®6/2®6 


5/2®V- 


ECB CBC"^ CEB 


4/4 


2®6/256 


4/5 • 2®®/- 


ECB CFB-i|CBC 


CBC|CFB|ECB 


4/4 


2®6/256 


236/2®9/233 


ECB CFB-i CBC"^^ 


OFB|CFB ECB 


4/5 


2®6/256 


264/5 , 2S6/_ 


ECB CFB-i OFB 


CFB-^^ICFBIECB 


4/4 


2®6/256 


4/5 • 2®®/- 


ECB CFB-i CEB 


CFB|CFB|ECB 

CBC-i|CBC|CBC 


4/5 


2®6/256 


234/2®9/233 


ECB CFB-i CFB-i 


4/4 


2®6/256 


234/2®9/233 


CBC-^ICBC-^^ICBC 


OFB|CBC|CBC 


4/5 


2®6/256 


266/2®®/- 


CBC"^ CBC-^^ OFB 


CFB-^^ICBCICBC 


4/4 


2®6/256 


234/2®9/233 


CBC"^ CBC-^^ CEB 
Table 4. HHT-attack



Mode 


Complexity 


Biham 


Inverse 


ECB|OFB|ECB 


3/5 • 2'^“/2‘^“ 


2«4/5 . 2‘='^/2®® 


itself 


ECB pFB CBC 


3/5 • 2®®/2®® 


264/5 , 256/2®® 


CBC“^|OFB|ECB 


ECBpFB CBC"^ 


3/5 • 2®®/2®® 


2®®/2®V- 


CBC|OFB|ECB 


ECBpFBpFB 


3/5 • 2®®/2®® 


265/265/265 


OFBpFB|ECB 


ECBpFB CFB 


3/5 • 2®®/2®® 


264/5 , 256/2®® 


CFB-i|OFB|ECB 


ECBpFB CFB-i 


3/5 • 2®®/2®® 


2®®/2®V- 


CFB|OFB|ECB 



Table 5. 7E/EH-attack 


Mode 


Complexity 


Biham 


Inverse 


ECB 1 CFB 1 CFB 


3/4- 


25b /2^b 


2®4/25y/24a 


CFB" 


■^|CFB“^|ECB 


CBC|ECB|CBC 


3/3 • 


256/256 


234 /2®9/233 


CBC" 


■^|ECB|CBC“^ 


CBC ECB CFB 


3/3 • 


256/2®® 


234 /2®9/233 


CFB" 


i|ECB|CBC-^ 


OFB|ECB|CBC 


3/4- 


256/2®® 


2®4/g , 256/_ 


CBC" 


■^|ECB|OFB 


CBC“^|ECB|CBC 


3/4- 


256/2®® 


4/5 • 2®®/- 


itself 




CBC“^ ECB CFB 


3/4- 


256/2®® 


4/5 • 2®®/- 


CFB" 


■i|ECB|CBC 


CFB|ECB|CBC 


3/4- 


256/2®® 


234 /2®9/233 


CBC" 


■^|ECB|CFB“^ 


CBC“^|CFB|CFB 


3/4- 


256/2®® 


234 /2®9/233 


CFB" 


■^|CFB-^|CBC 


OFB|ECB|CFB 


3/4- 


256/2®® 


2®4/5 , 256/_ 


CFB" 


ECB|OFB 


OFB CFB|CFB 


3/5 • 


256/2®® 


2®6/2®9/_ 


CFB" 


CFB“i|OFB 


CFB 1 ECB 1 CFB 


3/4- 


256/2®® 


234 /2®9/233 


CFB" 


ECB|CFB-i 


CFB CFB 1 CFB 
CFB“i|CFB|CFB 
CFB - 1 ECB 1 CFB 


3/5 • 


256/2®® 


234 /2®0/233 


CFB" 


CFB-^|CFB-^ 


3/4- 


256/2®® 


234 /2®9/233 


CFB" 


CFB-i CFB 


3/4- 


256/2®® 


4/5 • 2®®/- 


itself 





Table 6 . JE/E/E-attack 


Mode 


Complexity 


Biham 


Inverse 


CBC|OFB|CBC 


3/5- 2®®/2®6 


266 / 2 ®®/- 


CBC""|OFB|CBC"" 


CBCpFB CFB 


3/5- 2®6/2®6 


266 / 2 ®®/- 


CFB""i|OFB|CBC"^ 


CBC“^|OFB|CBC 


3/5- 2®®/2®6 


266 / 5 , 256/_ 


itself 


OFB|OFB|CBC 


3/5- 2®®/2®6 


265/265/2®® 


CBC"^|OFB|OFB 


CBC“^|OFB|CFB 


3/5- 2®®/2®6 


266 / 5 , 256/_ 


CFB-i|OFB|CBC 


CFB|OFB|CBC 


3/5- 2®6/2®6 


266 / 2 ®®/- 


CBC"^|OFB|CFB-i 


OFB|OFB|CFB 


3/5- 2®6/2®6 


265/265/2®® 


CFB'i|OFB|OFB 


CFB|OFB|CFB 

CFB-i|OFB|CFB 


3/5- 2®6/2®6 


266 / 266 /- 


CFB~ipFB CFB-i 


3/5- 2®®/2®6 


266 / 5 , 2 S 6 /_ 


itself 
Table 7. IV IV IV A-attack



Mode 


Complexity 


Biham 


Inverse 


CBC|CBC|CBC 


4/4 


200/230 


254/2»'J/2^^ 


CBC""|CBC“^|CBC“" 


CBC CBC CFB 


4/4 


256/256 


234/260/233 


CFB“i|CBC"^|CBC“^ 


CBC CFB|CBC 


4/4 


256/256 


234/260/233 


CBC-i|CFB-i CBC-i 


CBC CFB CFB 


4/4 


256/256 


234/260/233 


CFB-i|CFB-i|CBC"^ 


CBC"^|CBC|CFB 


4/4 


256/256 


5/5 • 2®®/2®® 


CFB-i CBC"^|CBC 


CFB|CBC|CBC 


4/4 


256/256 


234/260/233 


CBC-i|CBC-i|CFB-i 


CBC"^|CFB|CBC 


4/4 


256/256 


5/5 • 2®®/- 


CBC"^ CFB-i|CBC 


OFB|CFB|CBC 


4/5 


256/256 


266/2®®/- 


CBC"^ CFB-i pFB 


CFB-i|CFB|CBC 


4/4 


256/256 


5/5 • 2®®/- 


CBC-i CFB-i CFB 


CFB|CFB|CBC 


4/4 


256/256 


234/260/233 


CBC"^ CFB-i CFB-i 


OFB|CBC|CFB 


4/5 


256/256 


266/26®/- 


CFB-i|CBC"^pFB 


CFB~i|CBC|CFB 


4/4 


256/256 


5/5 • 2®®/- 


CFB-i CBC-i CFB 



Table 8. IV IV AAA-attack 



Mode 


Complexity 


Biham 


Inverse 


ECB|CBC|CFB 


5/4- 2®6/2®® 


2®^/2®®/2®® 


CFB-"|CBC“"|ECB 


CFB|CBC|CFB 


5/4- 2®®/2®® 


234 / 260/233 


CFB-i CBC“^ CFB~i 



Table 9. IV IV AAAB-attack 



Mode 


Complexity 


Biham 


Inverse 


CBC|CFB|CBC 


6/5 • 2®®/2®® 


2®^/2®®/2®® 


CBC“"|CFB-"|ECB 
Abstract. Let A be a Feistel scheme with 5 rounds from 2n bits to 2n
bits. In the present paper we show that for most such schemes A:

1. It is possible to distinguish A from a random permutation from 2n
bits to 2n bits after doing at most $O(2^{7n/4})$ computations with $O(2^{7n/4})$
random plaintext/ciphertext pairs.

2. It is possible to distinguish A from a random permutation from 2n
bits to 2n bits after doing at most $O(2^{3n/2})$ computations with $O(2^{3n/2})$
chosen plaintexts.

Since the complexities are smaller than the number $2^{2n}$ of possible inputs,
they show that some generic attacks always exist on Feistel schemes
with 5 rounds. Therefore we recommend in cryptography to use Feistel
schemes with at least 6 rounds in the design of pseudo-random permutations.

We will also show in this paper that it is possible to distinguish most 6
round Feistel permutation generators from a truly random permutation
generator by using a few (i.e. $O(1)$) permutations of the generator and
by using a total number of $O(2^{2n})$ queries and a total of $O(2^{2n})$ computations.
This result is not really useful to attack a single 6 round Feistel
permutation, but it shows that when we have to generate several pseudo-random
permutations on a small number of bits we recommend to use
more than 6 rounds. We also show that it is possible to extend these
results to any number of rounds, however with an even larger complexity.

Keywords: Feistel permutations, pseudo-random permutations, generic
attacks on encryption schemes, Luby-Rackoff theory.



1 Introduction 

Many secret key algorithms used in cryptography are Feistel schemes (a precise
definition of a Feistel scheme is given in section 2), for example DES, TDES,
many AES candidates, etc. In order to be as fast as possible, it is desirable to
have not too many rounds. However, for security reasons it is important to have
a sufficient number of rounds. Generally, when a Feistel scheme is designed for
cryptography, the designer either uses many (say $\geq 16$ as in DES) very simple
rounds, or uses very few (for example 8 as in DFC) more complex rounds. A
natural question is: what is the minimum number of rounds required in a Feistel
scheme to avoid all the "generic attacks", i.e. all the attacks effective against
most of the schemes, and with a complexity negligible compared with a search
on all the possible inputs of the permutation?

C. Boyd (Ed.): ASIACRYPT 2001, LNCS 2248, pp. 222-238, 2001.
(c) Springer-Verlag Berlin Heidelberg 2001

Let us assume that we have a permutation from 2n bits to 2n bits. Then a
generic attack will be an attack with a complexity negligible compared to $O(2^{2n})$,
since there are $2^{2n}$ possible inputs on 2n bits.

It is easy to see that for a Feistel scheme with only one round there is a
generic attack with only 1 query of the permutation and $O(1)$ computations:
just check if the first half (n bits) of the output is equal to the second half of
the input.

In [4] it was shown that for a Feistel scheme with two rounds there is also
a generic attack with a complexity of $O(1)$ chosen inputs (or $O(2^{n/2})$ random
inputs).

Also in [4], M. Luby and C. Rackoff have shown their famous result: for 3
rounds or more, all generic attacks on Feistel schemes require at least $O(2^{n/2})$
inputs, even for chosen inputs. If we call a Luby-Rackoff construction (a.k.a.
L-R construction) a Feistel scheme instantiated with pseudo-random functions, this
result says that the Luby-Rackoff construction with 3 rounds is a pseudorandom
permutation.

Moreover for 4 rounds all the generic attacks on Feistel schemes require at
least $O(2^{n/2})$ inputs, even for a stronger attack that combines chosen inputs and
chosen outputs (see [4] and a proof in [6], that shows that the Luby-Rackoff
construction with 4 rounds is super-pseudorandom, a.k.a. strong pseudorandom).
However it was discovered in [7] (and independently in [1]) that these lower
bounds on 3 and 4 rounds are tight, i.e. there exists a generic attack on all Feistel
schemes with 3 or 4 rounds with $O(2^{n/2})$ chosen inputs and $O(2^{n/2})$ computations.

For 5 rounds or more the question remained open. In [7] it was proved that
for 5 rounds (or more) the number of queries must be at least $O(2^{2n/3})$ (even with
unbounded computation complexity), and in [8] it was shown that for 6 rounds
(or more) the number of queries must be at least $O(2^{3n/4})$ (even with unbounded
computations).

It can be noticed (see [7]) that if we have access to unbounded computations,
then we can make an exhaustive search on all the possible round functions of
the Feistel scheme, and this will give an attack with only $O(2^n)$ queries (see [7])
but a gigantic complexity $\geq O(2^{n 2^n})$. This "exhaustive search" attack always
exists, but since its complexity is far larger than the exhaustive search
on plaintexts in $O(2^{2n})$, it was still an open problem to know if generic attacks,
with a complexity $\ll O(2^{2n})$, exist on 5 rounds (or more) of Feistel schemes.

In this paper we will indeed show that there exist generic attacks on 5 rounds
of the Feistel scheme, with a complexity $\ll O(2^{2n})$. We describe two attacks on
5 round Feistel schemes:

1. An attack with $O(2^{7n/4})$ computations on $O(2^{7n/4})$ random input/output
pairs.

2. An attack with $O(2^{3n/2})$ computations on $O(2^{3n/2})$ chosen inputs.

For 6 rounds (or more) the problem remains open. In this paper we will describe
some attacks on 6 rounds (or more) with a complexity much smaller than the $O(2^{n 2^n})$
of exhaustive search, but still $\geq O(2^{2n})$. So these attacks on 6 rounds and more
are generally not interesting against a single permutation. However they may be
useful when several permutations are used, i.e. they will be able to distinguish
some permutation generators. These attacks show for example that when several
small permutations must be generated (for example in the Graph Isomorphism
scheme, or as in the Permuted Kernel scheme) then we must not use a 6 round
Feistel construction.

Remark. The generic attacks presented here for 3, 4 and 5 rounds are effective
against most Feistel schemes, or when the round functions are randomly chosen.
However it can occur that for specific choices of the round functions, the
attacks, performed exactly as described, may fail. In this case, very
often there are modified attacks on these specific round functions. This point
will be discussed in section 6.

2 Notations 

We use the following notations that are very similar to those used in [4], [5] and
[8].

- $I_n = \{0, 1\}^n$ is the set of the $2^n$ binary strings of length n.

- For $a, b \in I_n$, $[a, b]$ will be the string of length 2n of $I_{2n}$ which is the
concatenation of a and b.

- For $a, b \in I_n$, $a \oplus b$ stands for bit by bit exclusive or of a and b.

- $\circ$ is the composition of functions.

- The set of all functions from $I_n$ to $I_n$ is $F_n$. Thus $|F_n| = 2^{n \cdot 2^n}$.

- The set of all permutations from $I_n$ to $I_n$ is $B_n$. Thus $B_n \subset F_n$ and
$|B_n| = (2^n)!$

- Let $f_1$ be a function of $F_n$. Let L, R, S and T be elements of $I_n$. Then by
definition

$\Psi(f_1)[L, R] = [S, T] \Leftrightarrow S = R$ and $T = L \oplus f_1(R)$

- Let $f_1, f_2, \ldots, f_k$ be k functions of $F_n$. Then by definition:

$\Psi^k(f_1, \ldots, f_k) = \Psi(f_k) \circ \cdots \circ \Psi(f_2) \circ \Psi(f_1)$.

The permutation $\Psi^k(f_1, \ldots, f_k)$ is called "a Feistel scheme with k rounds"
and also called $\Psi^k$.
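The definitions above can be sketched in a few lines of Python. This is only an illustrative toy model, not code from the paper: the helper names (`random_function`, `feistel_round`, `feistel`, `feistel_inverse`) are hypothetical, elements of $I_n$ are represented as integers below $2^n$, and a function of $F_n$ is stored as a lookup table, which is only practical for small n.

```python
import random

def random_function(n, rng):
    """Sample a uniformly random element of F_n as a table of 2^n n-bit values."""
    return [rng.randrange(1 << n) for _ in range(1 << n)]

def feistel_round(f, L, R):
    """One round Psi(f): [L, R] -> [S, T] with S = R and T = L xor f(R)."""
    return R, L ^ f[R]

def feistel(fs, L, R):
    """Psi^k(f_1, ..., f_k): Psi(f_1) is applied first and Psi(f_k) last."""
    for f in fs:
        L, R = feistel_round(f, L, R)
    return L, R

def feistel_inverse(fs, S, T):
    """Undo the rounds in reverse order; one round is inverted by
    R = S and L = T xor f(S)."""
    for f in reversed(fs):
        S, T = T ^ f[S], S
    return S, T
```

For instance, with `fs = [random_function(4, random.Random(1)) for _ in range(5)]`, the call `feistel(fs, L, R)` evaluates a 5 round scheme on 8-bit blocks, and `feistel_inverse` recovers the input, which illustrates that the construction is a permutation even when the round functions are not.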

3 Generic Attacks on 1,2,3, and 4 Rounds 

Up till now, generic attacks had been discovered for Feistel schemes with 1, 2, 3, 4
rounds. Let us briefly describe these attacks.

Let f be a permutation of $B_{2n}$. For a value $[L_i, R_i] \in I_{2n}$ we will denote by
$[S_i, T_i] = f[L_i, R_i]$.

1 round

The attack just tests if $S_1 = R_1$. If f is a Feistel scheme with 1 round, this will
happen with 100% probability, and if f is a random permutation with probability
$\simeq 1/2^n$. So with one round there is a generic attack with only 1 random query and
$O(1)$ computations.

2 rounds

Let us choose $R_2 = R_1$ and $L_2 \neq L_1$. Then the attack just tests if $S_1 \oplus S_2 = L_1 \oplus L_2$.
This will occur with 100% probability if f is a Feistel scheme with 2 rounds, and
if f is a random permutation with probability $\simeq 1/2^n$. So with two rounds there
is a generic attack with only 2 chosen queries and $O(1)$ computations.

Note 1: It is possible to transform this chosen plaintext attack into a known
plaintext attack as follows. If we have $O(2^{n/2})$ random inputs $[L_i, R_i]$,
then with a good probability we will have a collision $R_i = R_j$, $i \neq j$. Then we
test if $S_i \oplus S_j = L_i \oplus L_j$. Now the attack requires $O(2^{n/2})$ random queries and
$O(2^{n/2})$ computations.

Note 2: This attack on 1 and 2 rounds was already described in [4].
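As a minimal sketch of the two tests above, assuming the permutation is available as a Python callable `oracle(L, R)` returning `(S, T)` (an interface chosen here purely for illustration), one could write the following. The function names are hypothetical, and the choice of $L_2$ as $L_1$ with one bit flipped is arbitrary; any $L_2 \neq L_1$ would do.

```python
import random

def feistel(fs, L, R):
    """k round Feistel scheme; each f in fs is a lookup table on n-bit values."""
    for f in fs:
        L, R = R, L ^ f[R]
    return L, R

def consistent_with_one_round(oracle, n, rng):
    """1 round test: a single random query [L_1, R_1], then check S_1 == R_1."""
    L, R = rng.randrange(1 << n), rng.randrange(1 << n)
    S, _T = oracle(L, R)
    return S == R

def consistent_with_two_rounds(oracle, n, rng):
    """2 round test: two chosen queries with R_2 = R_1 and L_2 != L_1,
    then check S_1 xor S_2 == L_1 xor L_2."""
    R = rng.randrange(1 << n)
    L1 = rng.randrange(1 << n)
    L2 = L1 ^ 1  # arbitrary choice of a value different from L1
    S1, _ = oracle(L1, R)
    S2, _ = oracle(L2, R)
    return (S1 ^ S2) == (L1 ^ L2)
```

Both functions always return `True` on a Feistel scheme with the matching number of rounds, whereas on a random permutation each returns `True` only with probability about $1/2^n$.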

3 rounds

Let $\phi$ be the following algorithm:

1. $\phi$ chooses m distinct $R_i$, $1 \leq i \leq m$, and chooses $L_i = 0$ (or $L_i$ constant) for
all i, $1 \leq i \leq m$.

2. $\phi$ asks for the values $[S_i, T_i] = f[L_i, R_i]$, $1 \leq i \leq m$.

3. $\phi$ counts the number N of equalities of the form $R_i \oplus S_i = R_j \oplus S_j$, $i < j$.

4. Let $N_0$ be the expected value of N when f is a random permutation, and $N_1$
be the expected value of N when f is a $\Psi^3(f_1, f_2, f_3)$, with randomly chosen
$f_1, f_2, f_3$.

Then $N_1 \simeq 2 N_0$, because when f is a $\Psi^3(f_1, f_2, f_3)$, $R_i \oplus S_i = f_2(f_1(R_i))$,
so the equality $R_i \oplus S_i = R_j \oplus S_j$, $i < j$, holds if $f_1(R_i) \neq f_1(R_j)$ and
$f_2(f_1(R_i)) = f_2(f_1(R_j))$, or if $f_1(R_i) = f_1(R_j)$.

So by counting N we will obtain a way to distinguish 3 round Feistel permutations
from random permutations. This generic attack requires $O(2^{n/2})$ chosen
queries and $O(2^{n/2})$ computations (just store the values $R_i \oplus S_i$ and count the
collisions).

Remark. Here $N_1 \simeq 2 \cdot N_0$ when $f_1, f_2, f_3$ are randomly chosen. Therefore this
attack is effective on most 3 round Feistel schemes but not necessarily on all
3 round Feistel schemes. (See section 6 for more comments on this point).
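The counting step of algorithm $\phi$ can be sketched as follows. This is a hedged illustration rather than the authors' implementation: `oracle(L, R)` is an assumed callable returning `(S, T)`, the m distinct values $R_i$ are simply taken to be 0, ..., m-1, and the names `count_collisions` and `three_round_statistic` are hypothetical.

```python
from collections import Counter

def feistel(fs, L, R):
    """k round Feistel scheme; each f in fs is a lookup table on n-bit values."""
    for f in fs:
        L, R = R, L ^ f[R]
    return L, R

def count_collisions(values):
    """Number of pairs i < j with values[i] == values[j], via a hash table."""
    return sum(c * (c - 1) // 2 for c in Counter(values).values())

def three_round_statistic(oracle, m):
    """N = #{i < j : R_i xor S_i == R_j xor S_j}, with L_i = 0 and the m
    distinct values R_i = 0, 1, ..., m - 1 (requires m <= 2^n)."""
    values = []
    for R in range(m):
        S, _T = oracle(0, R)
        values.append(R ^ S)
    return count_collisions(values)
```

A distinguisher would compare the returned N with the value expected for a random permutation, roughly $m(m-1)/2^{n+1}$, and guess "3 round Feistel scheme" when N is close to twice that value; the exact decision threshold is left open here.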

4 rounds

This time, we take $R_i = 0$ (or $R_i$ constant), and we count the number N of
equalities of the form $S_i \oplus L_i = S_j \oplus L_j$, $i < j$. In fact, when $f = \Psi^4(f_1, f_2, f_3, f_4)$,
then $S_i \oplus L_i = f_3(f_2(L_i \oplus f_1(0))) \oplus f_1(0)$. So the probability of such an equality is
about twice as large in this case (as long as $f_1, f_2, f_3$ are randomly chosen) as in
the case where f is a random permutation (because if $f_2(L_i \oplus f_1(0)) = f_2(L_j \oplus f_1(0))$
this equality holds, and if $\beta_i = f_2(L_i \oplus f_1(0)) \neq f_2(L_j \oplus f_1(0)) = \beta_j$ but
$f_3(\beta_i) = f_3(\beta_j)$, this equality also holds).

So by counting N we will obtain a way to distinguish 4 round Feistel permutations
from random permutations. This generic attack requires $O(2^{n/2})$ chosen
queries and $O(2^{n/2})$ computations (just store the values $S_i \oplus L_i$ and count the
collisions).

Notes:

1. These attacks for 3 and 4 rounds were first published in [7], and independently
re-discovered in [1].

2. Here again the attack is effective against most 4 round Feistel schemes
but not necessarily on all 4 round Feistel schemes. (See section 6 for more
comments on this point).
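The 4 round statistic differs from the 3 round one only in which half of the input is fixed and which XOR is collected. A short sketch, under the same illustrative assumptions as before (an `oracle(L, R)` callable returning `(S, T)`, hypothetical function names, and the m distinct values $L_i$ taken as 0, ..., m-1):

```python
from collections import Counter

def feistel(fs, L, R):
    """k round Feistel scheme; each f in fs is a lookup table on n-bit values."""
    for f in fs:
        L, R = R, L ^ f[R]
    return L, R

def four_round_statistic(oracle, m):
    """N = #{i < j : S_i xor L_i == S_j xor L_j}, querying [L_i, 0] for the m
    distinct values L_i = 0, 1, ..., m - 1 (requires m <= 2^n)."""
    seen = Counter()
    for L in range(m):
        S, _T = oracle(L, 0)
        seen[S ^ L] += 1
    # Each value observed c times contributes c * (c - 1) / 2 colliding pairs.
    return sum(c * (c - 1) // 2 for c in seen.values())
```

On a 4 round scheme the collected value equals $f_3(f_2(L_i \oplus f_1(0))) \oplus f_1(0)$, so the count depends only on $f_1(0)$, $f_2$ and $f_3$, as stated in the text.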



4 A Generic Attack on 5 Round Feistel Permutations
with $O(2^{7n/4})$ Random Plaintexts and $O(2^{7n/4})$
Complexity

4.1 Notations for 5 Round Feistel Permutations

Let i be an integer. For any given i, let $[L_i, R_i]$ be a string of 2n bits in $I_{2n}$. Let
$\Psi^5[L_i, R_i] = [S_i, T_i]$.

We introduce the intermediate variables $X_i$, $P_i$ and $Y_i$ such that:

$X_i = L_i \oplus f_1(R_i)$
$P_i = R_i \oplus f_2(X_i)$
$Y_i = X_i \oplus f_3(P_i)$

So we have: $S_i = P_i \oplus f_4(Y_i)$ and $T_i = Y_i \oplus f_5(S_i)$. In other terms we have
the following:

$\Psi(f_1)[L_i, R_i] = [R_i, X_i]$, as $X_i = L_i \oplus f_1(R_i)$
$\Psi(f_2)[R_i, X_i] = [X_i, P_i]$, as $P_i = R_i \oplus f_2(X_i)$
$\Psi(f_3)[X_i, P_i] = [P_i, Y_i]$, as $Y_i = X_i \oplus f_3(P_i)$
$\Psi(f_4)[P_i, Y_i] = [Y_i, S_i]$, as $S_i = P_i \oplus f_4(Y_i)$
$\Psi(f_5)[Y_i, S_i] = [S_i, T_i]$, as $T_i = Y_i \oplus f_5(S_i)$

Input:             | L | R |
1 round:           | R | X |
2 rounds:          | X | P |
3 rounds:          | P | Y |
4 rounds:          | Y | S |
Output, 5 rounds:  | S | T |

Fig. 1.





Generic Attacks on Feistel Schemes 






We may notice that the following conditions (C) are always satisfied: 



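(C)
$R_i = R_j \Rightarrow X_i \oplus L_i = X_j \oplus L_j$ (CR)
$X_i = X_j \Rightarrow R_i \oplus P_i = R_j \oplus P_j$ (CX)
$P_i = P_j \Rightarrow X_i \oplus Y_i = X_j \oplus Y_j$ (CP)
$Y_i = Y_j \Rightarrow S_i \oplus P_i = S_j \oplus P_j$ (CY)
$S_i = S_j \Rightarrow Y_i \oplus T_i = Y_j \oplus T_j$ (CS)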



4.2 The Attack 

Let f be a permutation of $B_{2n}$. We want to know (with a good probability)
if f is a random element of $B_{2n}$, or if f is a Feistel scheme with 5 rounds (i.e.
$f = \Psi^5(f_1, f_2, f_3, f_4, f_5)$ with $f_1, f_2, f_3, f_4, f_5$ being 5 functions of $F_n$).

The attack proceeds as follows:

Step 1: We generate m values $[S_i, T_i] = f[L_i, R_i]$, $1 \leq i \leq m$, such that the
$[L_i, R_i]$ values are randomly chosen in $I_{2n}$ and with $m = O(2^{7n/4})$.

Step 2: We look if among these values, we can find 4 pairwise distinct indices
denoted by 1, 2, 3, 4 such that the following 8 equations (and 2 inequalities) are
satisfied:

(#)
$R_1 = R_3$
$R_2 = R_4$
$S_1 = S_3$
$S_2 = S_4$
$L_1 \oplus L_3 = L_2 \oplus L_4$
$S_1 \oplus S_2 = R_1 \oplus R_2$
$T_1 \oplus T_3 = L_1 \oplus L_3$
$T_1 \oplus T_3 = T_2 \oplus T_4$
(and with $R_1 \neq R_2$ and $L_1 \neq L_3$)

Fig. 2. A representation of the 8 equations (#) in L, S, R, T.

Below we explain how one can test with a complexity of $O(m)$ whether such
indices exist.

Step 3: If such indices exist, we will guess that f is a Feistel scheme with 5 rounds.
If not we will say that f is not a Feistel scheme. [We will see below that the
probability to find such indices is not negligible if f is a Feistel scheme with 5
rounds and $m \geq O(2^{7n/4})$, for most 5 round Feistel schemes.]



4.3 How to Accomplish Step 2 in $O(m)$ Computations

First, we find among the $m \times m$ possibilities, all the possible indices 1 and 3
such that:

$R_1 = R_3$
$S_1 = S_3$
$L_1 \oplus T_1 = L_3 \oplus T_3$

It is possible to do this in $O(m)$ computations instead of $O(m^2)$ by storing all
the m values $(R_i, S_i, L_i \oplus T_i)$ in a hash table and looking for collisions. We expect
to find about $m^2 / 2^{3n} \leq m$ such pairs of indices (as $m \ll 2^{2n}$).

In the same way we find all the possible indices 2 and 4 such that:

$R_2 = R_4$
$S_2 = S_4$
$L_2 \oplus T_2 = L_4 \oplus T_4$

Each part requires $O(m)$ computations and $O(m)$ of memory, and, if needed,
there is a tradeoff with $O(m \cdot \alpha)$ computations and $O(m / \alpha)$ memory.

Now we store all the values $(L_1 \oplus L_3, S_1 \oplus R_1)$ for all the indices (1, 3)
already found. There are about $m^2 / 2^{3n} \leq m$ such values. Then we store all the
values $(L_2 \oplus L_4, S_2 \oplus R_2)$ for all the indices (2, 4) already found. Using another
birthday paradox technique, we look for the following collision:

$L_2 \oplus L_4 = L_1 \oplus L_3$
$S_2 \oplus R_2 = S_1 \oplus R_1$

The complexity and the storage are $O(m^2 / 2^{3n}) \leq O(m)$ again. At the end we have
at most m choices of pairwise distinct indices (1, 2, 3, 4). Among these we keep
those that give $R_1 \neq R_2$ and $L_1 \neq L_3$. By inspection we check that they now
satisfy all the equations of (#).
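The two hashing stages described in this section can be sketched as follows. This is an illustrative reading of the procedure, not the authors' code: the samples are assumed to be given as a list of tuples `(L, R, S, T)`, the names `find_quadruples` and `satisfies_sharp` are hypothetical, and quadruples are reported as 0-based positions `(i1, i2, i3, i4)` in that list. The nested loops inside a bucket are quadratic in the bucket size, which is acceptable when buckets are small, as is expected for $m \ll 2^{2n}$.

```python
from collections import defaultdict

def find_quadruples(samples):
    """Return index quadruples (i1, i2, i3, i4) satisfying the conditions (#)."""
    # Stage 1: hash on (R, S, L xor T); two indices in the same bucket satisfy
    # R_a = R_b, S_a = S_b and L_a xor T_a = L_b xor T_b.
    buckets = defaultdict(list)
    for idx, (L, R, S, T) in enumerate(samples):
        buckets[(R, S, L ^ T)].append(idx)
    pairs = [(a, b) for group in buckets.values()
             for a in group for b in group if a != b]

    # Stage 2: hash each pair on (L_a xor L_b, S_a xor R_a); two pairs in the
    # same bucket play the roles of (1, 3) and (2, 4).
    by_key = defaultdict(list)
    for a, b in pairs:
        La, Ra, Sa, _ = samples[a]
        Lb = samples[b][0]
        by_key[(La ^ Lb, Sa ^ Ra)].append((a, b))

    found = []
    for group in by_key.values():
        for i1, i3 in group:
            for i2, i4 in group:
                if len({i1, i2, i3, i4}) < 4:
                    continue
                # Keep only R_1 != R_2 and L_1 != L_3.
                if samples[i1][1] != samples[i2][1] and samples[i1][0] != samples[i3][0]:
                    found.append((i1, i2, i3, i4))
    return found

def satisfies_sharp(samples, quad):
    """Direct check of the 8 equations and 2 inequalities of (#)."""
    (L1, R1, S1, T1), (L2, R2, S2, T2), (L3, R3, S3, T3), (L4, R4, S4, T4) = (
        samples[i] for i in quad)
    return (R1 == R3 and R2 == R4 and S1 == S3 and S2 == S4
            and L1 ^ L3 == L2 ^ L4 and S1 ^ S2 == R1 ^ R2
            and T1 ^ T3 == L1 ^ L3 and T1 ^ T3 == T2 ^ T4
            and R1 != R2 and L1 != L3)
```

The distinguisher of section 4.2 would then answer "5 round Feistel scheme" exactly when `find_quadruples(samples)` is non-empty; `satisfies_sharp` is only a sanity check that every reported quadruple really satisfies (#).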



4.4 Probability of (#) When f Is a Random Permutation of $B_{2n}$

When f is a random permutation of $B_{2n}$, we have $O(m^4)$ possibilities to choose the
indices 1, 2, 3, 4 among the m possible indices, and we have 8 equations to satisfy,
each choice of pairwise distinct 1, 2, 3, 4 satisfying them all with probability
about $1/2^{8n}$. By inspection we check that the equations of (#) are not dependent.
Thus the probability to have 4 pairwise distinct indices 1, 2, 3, 4 that satisfy (#)
is about $m^4 / 2^{8n}$ when f is a random permutation of $B_{2n}$ (n.b. the two additional
inequalities $R_1 \neq R_2$ and $L_1 \neq L_3$ change nothing). Since $m \ll 2^{2n}$ (because
$m = O(2^{7n/4})$) this probability is negligible.

4.5 Probability of (#) When f Is a Feistel Scheme with 5 Rounds

Theorem 1 When f is a Feistel scheme with 5 rounds, the 8 equations of (#)
are a logical consequence of the following 7 equations:

($\Lambda$)
$R_1 = R_3$ (1)
$R_2 = R_4$ (2)
$L_1 \oplus L_3 = L_2 \oplus L_4$ (3)
$S_1 = S_3$ (4)
$X_1 = X_2$ (5)
$P_1 = P_3$ (6)
$Y_1 = Y_2$ (7)

Proof of Theorem 1.

We will use the facts (CR), (CX), (CP), (CY) and (CS) that have been introduced
in section 4.1.

- From (1) and (CR) we get
$X_3 = X_1 \oplus L_1 \oplus L_3$ (8)

- From (2) and (CR) we get $X_4 \oplus L_4 = X_2 \oplus L_2$, and then using (8), (5) and
(3) we get
$X_4 = X_3$ (9)

- From (5) and (CX) we get:
$R_1 \oplus P_1 = R_2 \oplus P_2$ (10)

- From (9) and (CX) we get $R_4 \oplus P_4 = R_3 \oplus P_3$ and then from (10), (6), (1)
and (2) we get:
$P_4 = P_2$ (11)

- From (6) and (CP) we get $X_1 \oplus Y_1 = X_3 \oplus Y_3$ and then from (8) we get:
$Y_3 = Y_1 \oplus L_1 \oplus L_3$ (12)

- From (11) and (CP) we get $X_2 \oplus Y_2 = X_4 \oplus Y_4$ and then from (12), (7), (9),
(5) and (8) we get:
$Y_4 = Y_3$ (13)

- From (7) and (CY) we get $S_1 \oplus P_1 = S_2 \oplus P_2$ and then from (10) we get:
$S_1 \oplus S_2 = R_1 \oplus R_2$ (14)

- From (13) and (CY) we get $S_4 \oplus P_4 = S_3 \oplus P_3$ and then from (14), (4), (11),
(6) and (10) we get:
$S_4 = S_2$ (15)

- From (4) and (CS) we get $Y_1 \oplus T_1 = Y_3 \oplus T_3$ and then from (12) we get:
$T_3 = T_1 \oplus L_1 \oplus L_3$ (16)

- From (15) and (CS) we get $Y_4 \oplus T_4 = Y_2 \oplus T_2$ and then from (13), (7), (12)
and (16) we get:
$T_4 \oplus T_2 = T_1 \oplus T_3$ (17)

- If $R_1 = R_2$ then because of (5) we have $L_1 = L_2$, and $R_1 = R_2$ together with
$L_1 = L_2$ would give 1 = 2, whereas the indices 1 and 2 are distinct by definition. Thus
$R_1 \neq R_2$ (18)

- Finally since $1 \neq 3$ and because of (1) we have $L_1 \neq L_3$ (19)

So all the equations of (#) are indeed just consequences of the 7 equations
($\Lambda$) when f is a Feistel scheme with 5 rounds. Indeed the 8 + 2 conditions of (#) are
now in (1), (2), (3), (4), (15), (14), (16), (17), and finally (18) and (19).

Theorem 2 Let f be a Feistel scheme with 5 rounds, $f = \Psi^5(f_1, f_2, f_3, f_4, f_5)$.
Then for most such f, the probability to have 4 pairwise distinct indices 1, 2, 3, 4
that satisfy (#) is $\geq O(m^4 / 2^{7n})$, and thus is not negligible when $m \geq O(2^{7n/4})$.
Therefore the algorithm given in section 4 is indeed a generic way to distinguish
most Feistel schemes with 5 rounds from a truly random permutation of $B_{2n}$
with a complexity of $O(2^{7n/4})$.

Proof.

When $f_1, f_2, f_3, f_4, f_5$ are randomly chosen in $F_n$, the probability that there
exist pairwise distinct indices 1, 2, 3, 4 chosen out of a set of m indices such that
all the 7 equations ($\Lambda$) hold is $\simeq O(m^4 / 2^{7n})$. Thus from Theorem 1 we get
Theorem 2.

Remark. Here again, the attack is effective against most 5 round Feistel
schemes, but not necessarily on all 5 round Feistel schemes. (See section 6 for
more comments on that).
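As a worked check of the orders of magnitude in sections 4.4 and 4.5 (constants hidden by the $O(\cdot)$ notation are ignored, and the value n = 32 below is only an illustrative choice), plugging $m = 2^{7n/4}$ into the two estimates gives:

```latex
\[
m = 2^{7n/4}
\;\Longrightarrow\;
\frac{m^4}{2^{8n}} = \frac{2^{7n}}{2^{8n}} = 2^{-n}
\quad\text{(random permutation)},
\qquad
\frac{m^4}{2^{7n}} = \frac{2^{7n}}{2^{7n}} = 1
\quad\text{(5 round Feistel scheme)}.
\]
% Example: for n = 32 (64-bit blocks), m = 2^{56} and the random-permutation
% estimate is about 2^{-32}.
```

So with about $2^{7n/4}$ samples a quadruple satisfying (#) is expected with constant probability for most 5 round Feistel schemes, while it remains very unlikely for a random permutation.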



5 A Generic Attack on 5 Round Feistel Permutations
with $O(2^{3n/2})$ Chosen Plaintexts and $O(2^{3n/2})$ Complexity

This attack proceeds exactly as the previous attack of Section 4, except that
now Step 1 is replaced by the following Step' 1:

Step' 1. We generate m values $f[L_i, R_i] = [S_i, T_i]$, $1 \leq i \leq m$, such that the $L_i$
values are randomly chosen in $I_n$ and the $R_i$ values are randomly chosen in a
subset $I'_n$ of $I_n$ with only $2^{n/2}$ elements. For example $I'_n$ = all the strings of n bits
with the first n/2 bits at 0.

Let $m = O(2^{3n/2})$.



5.1 Probability of (#) When f Is a Random Permutation of $B_{2n}$

Now the probability that there are some indices 1, 2, 3, 4 such that equations (#)
are satisfied when f is randomly chosen in $B_{2n}$ is about

$\frac{m^4}{2^{n/2} \cdot 2^{n/2} \cdot 2^{6n}} = \frac{m^4}{2^{7n}}$

(because the equations $R_1 = R_3$ and $R_2 = R_4$ now have a probability $1/2^{n/2}$ to
be satisfied instead of $1/2^n$).

However, since here $m = O(2^{3n/2})$, this probability $m^4 / 2^{7n}$ is still negligible.

5.2 Probability of (#) When f Is a Feistel Scheme with 5 Rounds

When f is a Feistel scheme with 5 rounds, with $f_1, f_2, f_3, f_4, f_5$ randomly chosen
in $F_n$, the probability that there exist indices 1, 2, 3, 4 chosen out of a set of m
indices, such that all the 7 equations ($\Lambda$) are satisfied is about

$\frac{m^4}{2^{n/2} \cdot 2^{n/2} \cdot 2^{5n}} = \frac{m^4}{2^{6n}}$

(because the equations $R_1 = R_3$ and $R_2 = R_4$ now have a probability $1/2^{n/2}$ to be
satisfied instead of $1/2^n$).

So from Theorem 1 of section 4, we see that for these functions f the probability
that there exist indices 1, 2, 3, 4 such that all the 8 equations (and 2 inequalities)
(#) are satisfied is here generally $\geq O(m^4 / 2^{6n})$.

Thus the algorithm given in this section 5 is indeed a generic way to distinguish
most Feistel schemes with 5 rounds from a truly random permutation of
$B_{2n}$, with a complexity $O(2^{3n/2})$ and $O(2^{3n/2})$ chosen queries.

Remark. Here again some time/memory tradeoff is possible: use $O(2^{3n/2})$ chosen
queries, $O(2^{3n/2} \cdot \alpha)$ computations and $O(2^{3n/2} / \alpha)$ of memory.

6 Feistel Schemes with Specific Round Functions 

The problem. The generic attacks that we have presented for 3, 4 and 5 rounds
are effective against most Feistel schemes, or when the round functions are randomly
chosen. However it can occur that for specific choices of the round functions,
these attacks, if applied exactly as described, may fail. In these cases, very
often there are some other attacks, against these specific round functions, that
are even simpler. We will illustrate this on an example pointed out by an anonymous
referee of Asiacrypt'2001.

Theorem 3 (Knudsen, see [2] or [3]) Let $[L_1, R_1]$ and $[L_2, R_2]$ be two inputs
of a 5 round Feistel scheme, and let $[S_1, T_1]$ and $[S_2, T_2]$ be the outputs. Let us
assume that the round functions $f_2$ and $f_3$ are permutations (therefore they are
not random functions of $F_n$). Then if $R_1 = R_2$ and $L_1 \neq L_2$ it is impossible to
have simultaneously $S_1 = S_2$ and $L_1 \oplus L_2 = T_1 \oplus T_2$.

Proof.

$R_1 = R_2 \Rightarrow X_1 \oplus X_2 = L_1 \oplus L_2$, and $S_1 = S_2 \Rightarrow Y_1 \oplus Y_2 = T_1 \oplus T_2$. Therefore
if we have $L_1 \oplus L_2 = T_1 \oplus T_2$, we will have also:

$X_1 \oplus Y_1 = X_2 \oplus Y_2$.

Now since we have $Y_i = X_i \oplus f_3(P_i)$, we will have $f_3(P_1) = f_3(P_2)$ and since $f_3$
is a permutation we get $P_1 = P_2$.

Then since we have $P_i = R_i \oplus f_2[L_i \oplus f_1(R_i)]$ with $R_1 = R_2$, and since $f_2$ is a
permutation we get

$L_1 \oplus f_1(R_1) = L_2 \oplus f_1(R_2)$.

This is in contradiction with $R_1 = R_2$ and $L_1 \neq L_2$.

Attacks on 5 Round Feistel Schemes with $f_2$ and $f_3$ Permutations

From the above Theorem 3 we see that our attacks given in sections 4 and 5
against most 5 round Feistel schemes will fail when $f_2$ and $f_3$ are permutations.
Indeed, the event $R_1 = R_3$, $L_1 \neq L_3$, $S_1 = S_3$ and $L_1 \oplus L_3 = T_1 \oplus T_3$ will never
occur if $f_2$ and $f_3$ are permutations. However, in such a case there is an even
simpler attack that comes immediately from Theorem 3: we can randomly
get m input/output values and count the number of indices (i, j), $i < j$, such
that:

$R_i = R_j$
$S_i = S_j$
$L_i \oplus L_j = T_i \oplus T_j$

For a random permutation this number is $\simeq m^2 / 2^{3n}$ and for a 5 round Feistel
scheme with $f_2$ and $f_3$ being permutations, it is exactly 0.

This attack requires $O(2^{3n/2})$ random plaintext/ciphertext pairs and $O(2^{3n/2})$
computations.
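The counting attack above reduces to one pass over a hash table, because the three conditions are equivalent to a collision on the triple $(R, S, L \oplus T)$. The sketch below is a toy illustration under stated assumptions: samples are tuples `(L, R, S, T)` for pairwise distinct inputs, and the function name `count_impossible_pairs` is hypothetical.

```python
from collections import Counter

def feistel(fs, L, R):
    """k round Feistel scheme; each f in fs is a lookup table on n-bit values."""
    for f in fs:
        L, R = R, L ^ f[R]
    return L, R

def count_impossible_pairs(samples):
    """#{i < j : R_i = R_j, S_i = S_j, L_i xor L_j = T_i xor T_j}.

    The last condition is the same as L_i xor T_i == L_j xor T_j, so it is
    enough to count collisions on the key (R, S, L xor T)."""
    keys = Counter((R, S, L ^ T) for (L, R, S, T) in samples)
    return sum(c * (c - 1) // 2 for c in keys.values())
```

By Theorem 3 the count is exactly 0 on any set of distinct inputs when $f_2$ and $f_3$ are permutations, so a single counted pair is enough to conclude that the permutation is not such a scheme.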

Remark: This attack can also be extended to 6 round Feistel schemes when the 
round functions are permutations (or “quasi-permutations”), see [2,3] for details. 

Conclusion: It was known (before the present paper) that some generic attacks
on 5 round Feistel schemes exist when the round functions are permutations. This
particular case is interesting since two of the former AES candidates, namely
DFC and DEAL, were such Feistel schemes using permutations as round functions.
(More precisely they were "quasi-permutations" in DFC). The number of
rounds in these functions is however $\geq 6$.

In this paper we have shown a more general result that such generic attacks
exist for most 5 round Feistel schemes (even when $f_2$ and $f_3$ are not
permutations). It can be noticed that our attack is based on specific relations on 4
points (corresponding to 4 ciphertexts), while the previous attacks were based
on specific relations on only 2 points ("impossible differentials").



7 Attacking Feistel Generators 

In this section we will describe what is an attack against a generator of per- 
mutations (and not only against a single permutation randomly generated by a 
generator of permutations), i.e. we will be able to study several permutations 
generated by the generator. Then we will evaluate the complexity of brute force 
attacks and we will notice that since all Feistel permutations have an even sig- 
nature, it is possible to distinguish them from a random permutation in 0(2^”). 

Let G be a “k round Feistel generator”, i.e. from a binary string K, G
generates a k round Feistel permutation G_K of B_{2n}.

Let G' be a truly random permutation generator, i.e. from a string K, G'
generates a truly random permutation G'_K of B_{2n}.

Let G'' be a truly random even permutation generator, i.e. from a string K, G''
generates a truly random permutation G''_K of A_{2n}, with A_{2n} being the
group of all the permutations of B_{2n} with even signature.

We are looking for attacks that distinguish G from G', and also for attacks 
that will distinguish G from G". 

Adversarial model: An attacker can choose some strings K_1, ..., K_f, can ask
for some inputs [L_i, R_i] ∈ I_{2n}, and can ask for some G_K[L_i, R_i] (with K
being one of the K_i). Here the attack is more general than in the previous
sections, since the attacker can have access to many different permutations
generated by the same generator.

Adversarial goal: The aim of the attacker is to distinguish G from G' (or from 
G") with a good probability and with a complexity as small as possible. 

Brute force attacks. A possible attack is the exhaustive search on the k round
functions f_1, ..., f_k from I_n to I_n that have been used in the Feistel
construction. This attack always exists, but since we have 2^{k·n·2^n}
possibilities for f_1, ..., f_k, this attack requires about 2^{k·n·2^n}
computations (or 2^{k·n·2^n/2} computations in a “meet in the middle” version
of the attack), about k·2^{n-1} random queries (see footnote 1), and only 1
permutation of the generator.

Attack by the signature. 

Theorem 4 If n ≥ 2 then all the Feistel schemes from I_{2n} → I_{2n} have an
even signature.

Proof.

Let σ : I_{2n} → I_{2n}, [L, R] ↦ [R, L].

Let f_1 be a function of F_n.

Let Ψ'(f_1)[L, R] = [L ⊕ f_1(R), R].

We will show that both σ and Ψ'(f_1) have an even signature; then so does
σ ∘ Ψ'(f_1) = Ψ(f_1), and thus, by composition, all the Feistel schemes from
I_{2n} → I_{2n} have an even signature.

For σ: All the cycles have 1 or 2 elements; we have 2^n cycles with 1 element
(each with an even signature), and (2^{2n} − 2^n)/2 cycles with 2 elements.
When n ≥ 2 this number is even.

For Ψ'(f_1): All the cycles have 1 or 2 elements since Ψ'(f_1) ∘ Ψ'(f_1) = Id.
Moreover the number of cycles with 2 elements is k·2^n/2, with k being the
number of values R such that f_1(R) ≠ 0. So when n ≥ 2 the signature of
Ψ'(f_1) is even.

Theorem 5 Let f be a permutation of B_{2n}. Then using O(2^{2n}) computations
on the 2^{2n} input/output values of f, we can compute the signature of f.

Proof.

Just compute all the cycles c_i of f, f = ∏_{i=1}^{α} c_i, and use the formula:

signature(f) = ∏_{i=1}^{α} (−1)^{length(c_i)+1}.

Footnote 1: each query divides by about 2^{2n} the number of possible
f_1, ..., f_k.

Theorem 6 Let G be a Feistel scheme generator. Then it is possible to
distinguish G from a generator of truly random permutations of B_{2n} after
O(2^{2n}) computations on O(2^{2n}) input/output values.

Proof.

It is a direct consequence of Theorems 4 and 5 above.
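
Theorems 4 to 6 can be checked experimentally on very small block sizes. The
following is a minimal sketch, not the authors' code: the helper names
(feistel_permutation, signature, random_round_functions) are hypothetical, a
round is taken to be [L, R] -> [R, L xor f(R)], and round functions are stored
as plain lookup lists.

```python
import random

def feistel_permutation(round_functions, n):
    """Tabulate the Feistel permutation of 2n-bit values built from the
    given round functions (each a list of 2^n n-bit values)."""
    mask = (1 << n) - 1
    table = []
    for x in range(1 << (2 * n)):
        left, right = x >> n, x & mask
        for f in round_functions:
            # One round: [L, R] -> [R, L xor f(R)]
            left, right = right, left ^ f[right]
        table.append((left << n) | right)
    return table

def signature(perm):
    """Signature from the cycle decomposition: the product over all
    cycles of (-1)^(length + 1). Visits every point once."""
    seen = [False] * len(perm)
    sign = 1
    for start in range(len(perm)):
        if seen[start]:
            continue
        length = 0
        x = start
        while not seen[x]:
            seen[x] = True
            x = perm[x]
            length += 1
        if length % 2 == 0:  # even-length cycles are odd permutations
            sign = -sign
    return sign

def random_round_functions(k, n, rng):
    """k independent random functions from n bits to n bits."""
    return [[rng.randrange(1 << n) for _ in range(1 << n)] for _ in range(k)]
```

With n ≥ 2 the signature comes out as +1 for every choice of round functions,
whereas for n = 1 a single round with f = 0 is one transposition and therefore
odd, which is why the hypothesis n ≥ 2 is needed.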

Remark. 

It is however probably much more difficult to distinguish G from random
permutations of A_{2n}, with A_{2n} being the group of all the permutations of
B_{2n} with even signature. In the next sections we will present our best
attacks for this problem.

8 An Attack on 6 Round Feistel Generators in O(2^{2n})

Attacks on 6 round Feistel. If G is a generator of 6 round Feistel permutations
of B_{2n}, we have found an attack (described below) that uses a few (i.e.
O(1)) permutations from the generator G, O(2^{2n}) computations and about
O(2^{2n}) random queries. So this attack has a complexity much smaller than
that of the exhaustive search described in Section 7. However, since a
permutation of B_{2n} has only 2^{2n} possible inputs, this attack has no real
interest against a single specific 6 round Feistel scheme used in encryption.

It is interesting only if a few 6 round Feistel schemes are used. This can be
particularly interesting for some cryptographic schemes using many permutations
on a relatively small number of bits. For example, in the Graph Isomorphism
authentication scheme many permutations on about 2^{14} points are used (thus
n = 7), and in the Permuted Kernel Problem (PKP) of Adi Shamir many
permutations on about 2^6 points are used (n = 3 here). In such cases we are
able to distinguish these permutations from truly random permutations with a
small complexity if a 6 round Feistel scheme generator is used, whatever the
size of the secret key used in the generator may be. So we do not recommend
generating small pseudorandom permutations from 6 round Feistel schemes.

The Attack: 

Let [L_i, R_i] be an element of I_{2n}.
Let Ψ^6[L_i, R_i] = [S_i, T_i]. The attack proceeds as follows:

Step 1.

We choose a specific permutation f = G_K.
We generate m values f[L_i, R_i] = [S_i, T_i], 1 ≤ i ≤ m, with random
[L_i, R_i] ∈ I_{2n} and with m = O(2^{2n}).

Remark: Since m = O(2^{2n}), we cover here almost all the possible inputs
[L_i, R_i] for this specific permutation f.

Step 2.

We look if among these values we can find 4 pairwise distinct indices, denoted
by 1, 2, 3, 4, such that these 8 equations are satisfied:

        R_1 = R_3
        R_2 = R_4
        S_1 = S_2
(#)     S_3 = S_4
        L_1 ⊕ L_3 = L_2 ⊕ L_4
        L_1 ⊕ L_3 = S_1 ⊕ S_3
        T_1 ⊕ T_2 = T_3 ⊕ T_4
        T_1 ⊕ T_2 = R_1 ⊕ R_2

(and with R_2 ≠ R_1, S_3 ≠ S_1 and T_1 ≠ T_2).

[Figure: the four points 1, 2, 3, 4; points 1-2 and 3-4 are joined by edges
labelled “S, R⊕T”, and points 1-3 and 2-4 by edges labelled “R, L⊕S”.]

Fig. 3. A representation of the 8 equations # in L, S, R, T



It is also possible to show that all the indices that satisfy these equations
can be found in O(m) time and with O(m) memory. We count the number of
solutions found.
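
The 8 equations (#) and their three inequality conditions can be written down
directly as a predicate. The sketch below is illustrative only: the names
satisfies_equations and count_solutions_naive are hypothetical, each point is
a tuple (L, R, S, T) with f[L, R] = [S, T], and the counter is a naive
enumeration over ordered quadruples, not the O(m) procedure mentioned above.

```python
from itertools import permutations

def satisfies_equations(p1, p2, p3, p4):
    """Check the 8 equations (#) and the 3 inequalities for four points,
    each given as a tuple (L, R, S, T)."""
    L1, R1, S1, T1 = p1
    L2, R2, S2, T2 = p2
    L3, R3, S3, T3 = p3
    L4, R4, S4, T4 = p4
    return (
        R1 == R3 and R2 == R4
        and S1 == S2 and S3 == S4
        and (L1 ^ L3) == (L2 ^ L4)
        and (L1 ^ L3) == (S1 ^ S3)
        and (T1 ^ T2) == (T3 ^ T4)
        and (T1 ^ T2) == (R1 ^ R2)
        and R2 != R1 and S3 != S1 and T1 != T2
    )

def count_solutions_naive(points):
    """Count ordered quadruples of distinct points satisfying (#).
    Only usable for tiny inputs: the cost grows like len(points)^4."""
    return sum(
        1 for quad in permutations(points, 4) if satisfies_equations(*quad)
    )
```

Because (#) is symmetric under the relabellings (1,2,3,4) -> (2,1,4,3) and
(1,2,3,4) -> (3,4,1,2), this naive counter reports each unordered solution
four times.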



Step 3. 

We try again at Step 1 with another f = G_{K'}, and we do this a few times,
say λ times with λ = O(1). Let a be the total number of solutions found at
Step 2 for all the λ functions tested. It is possible to prove that for a
generator of pseudorandom permutations of B_{2n} we have

a ≃ λ m^4 / 2^{8n}.

Moreover it is possible to prove that for a generator of 6 round Feistel
schemes the average value we get for a satisfies

a ≥ about 2 λ m^4 / 2^{8n}.

Proof.

The proof is very similar to the proof we gave earlier (due to the lack of
space we do not make it explicit here).

So by counting this value a we will distinguish 6 round Feistel generators from
truly random permutation generators each time λ m^4 / 2^{8n} is not
negligible, for example when λ = O(1) and m = O(2^{2n}), as claimed.

Examples: Thus we are able to distinguish between a few 6 round Feistel
permutations taken from a generator and a set of truly random permutations (or
a set of random permutations with an even signature) from 32 bits to 32 bits,
within approximately 2^{32} computations and 2^{32} chosen plaintexts.



9 An Attack on k Round Feistel Generators 



It is also possible to extend these attacks to more than 6 rounds, indeed to
any number of rounds k. However for more than 6 rounds, as already for 6
rounds, all our attacks require a complexity and a number of queries
≥ O(2^{2n}), so they can be interesting to attack generators of permutations,
but not to attack a single permutation (the probability of success against one
single permutation is generally negligible, and we need a few, or many,
permutations from the generator in order to be able to distinguish the
generator from a truly random permutation generator).



Example of attack on a Feistel generator with k rounds. Let k be an integer.
For simplicity we will assume that k is even (the proof is very similar when k
is odd). Let λ = k/2 − 1. Let G be a generator of Feistel permutations of k
rounds of B_{2n}. We will consider an attack with a set of equations in
(L, R, S, T) illustrated in Figure 4. For simplicity we do not write all the
equations explicitly.

Here we have μ = λ^2 = (k/2 − 1)^2 indices, and we have
4λ(λ − 1) = k^2 − 6k + 8 equations in L, R, S, T. It is possible to prove that
the probability that the 4λ(λ − 1) equations of Figure 4 hold is about twice
as large for a Feistel scheme with k rounds as for a truly random permutation.

Thus, on a fixed permutation this attack succeeds with a probability in

O( m^{(k/2 − 1)^2} / 2^{n·4λ(λ − 1)} ).

If we take m = O(2^{2n}) for such a permutation, it gives a probability of
success in

O( 2^{2n(k/2 − 1)^2} / 2^{n(k^2 − 6k + 8)} ) = O( 1 / 2^{n(k^2/2 − 4k + 6)} ).

So we will use O(2^{n(k^2/2 − 4k + 6)}) permutations, and the total complexity
and the total number of queries on all these permutations will be
O(2^{n(k^2/2 − 4k + 8)}). The total memory will be O(2^{2n}).

[Figure: an array of points, λ points per side, joined by edges labelled
“S, R⊕T” and “R, L⊕S”.]

Fig. 4. Modelling the 4·λ(λ − 1) equations in L, R, S, T.



Examples: 

- With k = 6 this attack uses O(1) permutations and O(2^{2n}) computations
  (exactly as we did in Section 8).

- With k = 8 we need O(2^{6n}) permutations and O(2^{8n}) computations.
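
The exponents in these examples follow mechanically from the counts of indices
and equations given above. A small sketch of that arithmetic (the function
name k_round_attack_exponents is hypothetical; all exponents are expressed in
units of n, and only even k ≥ 6 is handled):

```python
def k_round_attack_exponents(k):
    """Return the exponents e (cost = O(2^(e*n))) for the k round attack,
    assuming m = 2^(2n) queries per permutation and even k >= 6."""
    if k < 6 or k % 2:
        raise ValueError("this sketch covers even k >= 6 only")
    lam = k // 2 - 1                  # lambda = k/2 - 1
    indices = lam * lam               # mu = (k/2 - 1)^2 points
    equations = 4 * lam * (lam - 1)   # equals k^2 - 6k + 8
    # log2(success probability) / n on one permutation when m = 2^(2n)
    log_success = 2 * indices - equations
    permutations = -log_success       # k^2/2 - 4k + 6
    total = permutations + 2          # each permutation costs 2^(2n)
    return {
        "indices": indices,
        "equations": equations,
        "permutations": permutations,
        "total": total,
    }
```

For k = 6 this gives 0 and 2 (O(1) permutations, O(2^{2n}) work), and for
k = 8 it gives 6 and 8, in agreement with the two examples.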



10 Conclusion 

Up till now, generic attacks on Feistel schemes were known only for 1, 2, 3 or
4 rounds. In this paper we have seen that some generic attacks also exist on
5 round Feistel schemes. So we do not recommend using 5 round Feistel schemes
in cryptography for general purposes. Our first attack requires O(2^{7n/4})
random plaintext/ciphertext pairs and the same amount of computation time.
Our second attack requires O(2^{3n/2}) chosen plaintext/ciphertext pairs and
the same amount of computation time. For example, it is possible to distinguish
most 5 round Feistel ciphers with blocks of 64 bits from a random permutation
from 64 bits to 64 bits, within about 2^{48} chosen queries and 2^{48}
computations.

We have also seen that, when several small pseudorandom permutations have to
be generated, we do not recommend using a Feistel scheme generator with only
6 rounds (whatever the length of the secret key may be). As an example, it is
possible to distinguish most generators of 6 round Feistel permutations from
truly random permutations on 32 bits, within approximately 2^{32} computations
and 2^{32} chosen plaintexts (and this whatever the length of the secret key
may be).

Similar attacks can be generalised to any number of rounds k, but they require
the analysis of many more permutations and they have a larger complexity as k
increases.

Abstract. Compact and high-speed hardware architectures and logic
optimization methods for the AES algorithm Rijndael are described.
Encryption and decryption data paths are combined and all arithmetic
components are reused. By introducing a new composite field, the S-Box
structure is also optimized. An extremely small size of 5.4 Kgates is
obtained for a 128-bit key Rijndael circuit using a 0.11-μm CMOS standard
cell library. It requires only 0.052 mm^2 of area to support both
encryption and decryption with 311 Mbps throughput. By making effective use
of the SPN parallel feature, the throughput can be boosted up to 2.6
Gbps for a high-speed implementation whose size is 21.3 Kgates.



1 Introduction 

DES (Data Encryption Standard) [14,1], which is a common-key block cipher 
for US federal information processing standards, has also been used as a de 
facto standard for more than 20 years. NIST (National Institute of Standards
and Technology) has selected Rijndael [2] as the new Advanced Encryption Standard
(AES) [13]. Many hardware architectures for Rijndael were proposed and their 
performances were evaluated by using ASIC libraries [8,18,10,9] and FPGAs [3, 
17,6,11,5]. However, they are simple implementations according to the Rijndael 
specification, and none are yet small enough for practical use. The AES has to be 
embeddable not only in high-end servers but also in low-end consumer products 
such as mobile terminals. Therefore, sharing and reusing hardware resources, 
and compressing the gate logic are indispensable to produce a small Rijndael 
circuit. 

The SPN structure of Rijndael is suitable for highly parallel processing, but
it usually requires more hardware resources than the Feistel structure used in
many other ciphers developed after DES. This is because all of the data is
encoded in each round of Rijndael processing, while only half of the data is
processed at once in DES. In addition, Rijndael has two separate data paths
for encryption and decryption.

In this paper, we describe a compact data path architecture for Rijndael,
where the hardware resources are efficiently shared between encryption and
decryption. The key arithmetic component, the S-Box, has been implemented using
look-up table logic or ROMs in the previous approaches, which requires a lot
of hardware support. Reference [16] proposed the use of composite field
arithmetic to reduce the computation cost of the S-Box, but no detailed
hardware implementation was provided. Therefore, we propose a methodology to
optimize the S-Box by introducing a new composite field, and show its
advantages in comparison to the previous work.

C. Boyd (Ed.): ASIACRYPT 2001, LNCS 2248, pp. 239-254, 2001.
(c) Springer-Verlag Berlin Heidelberg 2001



2 Rijndael Algorithm 

Fig. 1 shows a Rijndael encryption process for 128-bit plain text data string and 
a 128-bit secret key, with the number of rounds set to 10. These numbers are 
used throughout this paper, including for our hardware implementation. Each 
round and the initial stage requires a 128-bit round key, and thus 11 sets of round 
keys are generated from the secret key. The input data is arranged as a 4 x 4 
matrix of bytes. The primitive functions SubBytes, ShiftRows and MixColumns 
are based on byte-oriented arithmetic, and AddRoundKey is a simple 128-bitwise 
XOR operation. 

SubBytes is a nonlinear transformation that uses 16 byte substitution tables
(S-Boxes). An S-Box is the multiplicative inverse over a Galois field GF(2^8)
followed by an affine transformation. In the decryption process, the affine
transformation is executed prior to the inversion. The irreducible polynomial
used by a Rijndael S-Box is

m(x) = x^8 + x^4 + x^3 + x + 1.   (1)
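
As an illustration of this definition, the S-Box can be computed directly from
an inversion modulo m(x) followed by the affine map. This is a reference-style
software sketch, not the hardware method of this paper; the helper names are
hypothetical, and the affine map (bit rotations plus the constant {63}) is
taken from the AES specification rather than from the text above.

```python
def gf_mul(a, b, modulus=0x11B):
    """Multiply two bytes in GF(2^8) modulo m(x) = x^8 + x^4 + x^3 + x + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:        # reduce as soon as degree 8 appears
            a ^= modulus
        b >>= 1
    return result

def gf_inv(a):
    """Multiplicative inverse as a^254; 0 is mapped to 0 by convention."""
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result

def affine(b):
    """AES affine map: b'_i = b_i ^ b_(i+4) ^ b_(i+5) ^ b_(i+6) ^ b_(i+7) ^ c_i
    with indices mod 8 and c = 0x63 (constant from the AES specification)."""
    out = 0
    for i in range(8):
        bit = (
            (b >> i) ^ (b >> ((i + 4) % 8)) ^ (b >> ((i + 5) % 8))
            ^ (b >> ((i + 6) % 8)) ^ (b >> ((i + 7) % 8)) ^ (0x63 >> i)
        ) & 1
        out |= bit << i
    return out

def sbox(b):
    """S-Box: inversion in GF(2^8) followed by the affine transformation."""
    return affine(gf_inv(b))
```

For decryption the order is reversed: the inverse affine map is applied first
and the field inversion second.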

ShiftRows is a cyclic shift operation of the last three rows by different
offsets. MixColumns treats the 4-byte data in each column as coefficients of a
4-term polynomial, and multiplies the data modulo x^4 + 1 with the fixed
polynomial given by

c(x) = {03}x^3 + {01}x^2 + {01}x + {02}.   (2)

In the decryption process, InvMixColumns multiplies each column with the
polynomial

c^{-1}(x) = {0B}x^3 + {0D}x^2 + {09}x + {0E}   (3)

and InvShiftRows shifts the last three rows in the opposite direction from
ShiftRows.
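
Polynomials (2) and (3) can be exercised with a few lines of code. The sketch
below is an assumption-laden illustration (hypothetical helper names; a column
[s0, s1, s2, s3] is read as s0 + s1·x + s2·x^2 + s3·x^3): it multiplies two
4-term polynomials with GF(2^8) coefficients modulo x^4 + 1.

```python
def gf_mul(a, b, modulus=0x11B):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= modulus
        b >>= 1
    return result

# Coefficient lists, index = power of x.
C = [0x02, 0x01, 0x01, 0x03]       # c(x) from Equation (2)
C_INV = [0x0E, 0x09, 0x0D, 0x0B]   # c^{-1}(x) from Equation (3)

def poly_mul_mod(column, fixed):
    """Product of two 4-term polynomials modulo x^4 + 1; since x^4 = 1,
    exponents simply wrap around modulo 4."""
    out = [0, 0, 0, 0]
    for i in range(4):
        for j in range(4):
            out[(i + j) % 4] ^= gf_mul(column[i], fixed[j])
    return out

def mix_column(column):
    return poly_mul_mod(column, C)

def inv_mix_column(column):
    return poly_mul_mod(column, C_INV)
```

Multiplying c(x) by c^{-1}(x) with the same routine yields the constant
polynomial {01}, which is why InvMixColumns undoes MixColumns.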

The key expander in Fig. 1 generates 11 sets of 128-bit round keys from one
128-bit secret key by using a 4-byte S-Box. These round keys can be prepared on
the fly in parallel with the encryption process. In the decryption process,
these sets of keys are used in reverse order. Therefore, all keys have to be
generated and stored in registers in advance, or the final round key in the
encryption process has to be pre-calculated for on-the-fly key scheduling.
Because the first method requires the equivalent of a 1,408-bit register
(128 bits x 11), and is not suitable for compact hardware, the second approach
was chosen for the implementation described in the next section. Rcon[i] in
Fig. 1 is a 4-byte value: the lower 3 bytes are 0 for all i, and the highest
byte is the bit representation of the polynomial x^i mod m(x).

Fig. 1. Encryption process of Rijndael algorithm
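
The highest byte of Rcon[i] can be generated by repeated multiplication by x
modulo m(x), and regenerated in reverse order by repeated division by x. A
minimal sketch follows; the function names are hypothetical and the index i is
assumed to start at 0, so that the ten constants used with a 128-bit key run
from {01} to {36}.

```python
M = 0x11B  # m(x) = x^8 + x^4 + x^3 + x + 1

def xtime(b):
    """Multiply a byte by x modulo m(x)."""
    b <<= 1
    if b & 0x100:
        b ^= M
    return b

def xtime_inverse(b):
    """Divide a byte by x modulo m(x): if the constant term is set, add
    m(x) first (its constant term is 1), then shift right."""
    if b & 1:
        b ^= M
    return b >> 1

def rc_forward(count=10, start=0x01):
    """Round-constant bytes in encryption order."""
    values, v = [], start
    for _ in range(count):
        values.append(v)
        v = xtime(v)
    return values

def rc_backward(count=10, start=0x36):
    """The same bytes in decryption (reverse) order."""
    values, v = [], start
    for _ in range(count):
        values.append(v)
        v = xtime_inverse(v)
    return values
```

Generating the constants on the fly in both directions avoids storing them,
which fits the on-the-fly key scheduling chosen for decryption.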



3 Data Path Architecture 

3.1 Data Path Sharing between Encryption and Decryption 

In order to minimize the size of our Rijndael hardware, resource sharing in the 
data path is fully employed as shown in Fig. 2. This circuit can execute both 
encryption and decryption. The 128-bit data (4x4 bytes) block is divided into 
four 32-bit columns, and is processed column by column through the 32-bit data 
bus. Therefore one round takes 4 clock cycles. It is not a good idea to make 
the bus width smaller than 32 bits, because the MixColumns operation needs 
32 bits of data at one time. A smaller bus requires more registers and
selectors, and resource sharing is hindered, resulting in an inefficient
implementation.

The “Enc/Dec block” has 16-byte data registers, and they execute ShiftRows
(or InvShiftRows) operations by themselves. Each 4-byte column is transformed
by four parallel S-Boxes as SubBytes (or InvSubBytes). The order of ShiftRows
and SubBytes is different from that in Fig. 1, though this does not affect the
operations’ results.
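
The reason the order can be exchanged is that SubBytes acts on each byte
independently while ShiftRows only moves bytes around. A small sketch of this
argument, using hypothetical helper names and an arbitrary stand-in byte
permutation instead of the real S-Box table:

```python
def shift_rows(state):
    """Cyclically shift row r of a 4x4 state (list of rows) left by r."""
    return [row[r:] + row[:r] for r, row in enumerate(state)]

def sub_bytes(state, table):
    """Apply a byte substitution table to every byte of the state."""
    return [[table[b] for b in row] for row in state]

# Stand-in substitution table (NOT the Rijndael S-Box): any byte-wise
# mapping commutes with a rearrangement of byte positions.
STAND_IN_TABLE = [(7 * b + 3) % 256 for b in range(256)]

def orders_agree(state, table=STAND_IN_TABLE):
    """True if SubBytes-then-ShiftRows equals ShiftRows-then-SubBytes."""
    return shift_rows(sub_bytes(state, table)) == sub_bytes(shift_rows(state), table)
```

The same holds for InvShiftRows and InvSubBytes, since the argument does not
depend on the particular table or shift offsets.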

Selectors change the circuit state between encryption and decryption. The
data path

δ → x^{-1} → δ^{-1} and affine → MixColumns

is selected for encryption, and the path

affine^{-1} and δ → x^{-1} → δ^{-1} → InvMixColumns

is used for decryption. δ^{-1} and δ are isomorphism functions for field
conversions. Details are described in Section 4.

By moving InvMixColumns from the front of each S-Box to the back, Mix- 
Columns and InvMixColumns can be merged and some selectors are eliminated. 
As a result, the circuit size and the critical path length are reduced. An addi- 
tional InvMixColumns is required in the key expander, but the area impact is 
minor. 



3.2 S-Box Sharing with Key Expander 

The key expander reuses the S-Boxes in the encryption/decryption block to 
generate a 128-bit key in each round. The S-Boxes are used once by the key 
expander, and four times by the encryption/decryption block, for a total of five 
times in every round. While the key expander uses the S-Boxes, the ShiftRows (or 
InvShiftRows) operation is executed simultaneously. As shown in Fig. 1, only the 
AddRoundKey operation is executed in the initial round, and the MixColumns 
(or InvMixColumns for decryption) is omitted in the final round. This operation 
switching is carried out by controlling the 4:1 selector at the bottom of Fig. 2. 
The first round key used in AddRoundKey is the initial key data stored in the 
key registers, and a transformation with the S-Boxes is not necessary. Therefore 
the first round takes four cycles, and the entire encryption process takes 54 (=
4 + 5 x 10) cycles. The decryption process also takes 54 cycles. When a new
secret key is provided, the key expander takes 10 cycles to generate the initial 
decryption key, which is the final round key in the encryption. 

As described in Section 2, Rcon[i] is a 4-byte constant value, and the highest
order byte is generated by modular multiplication on GF(2^8). The circuit RC in
Fig. 3 generates the constant values sequentially during the encryption
process, starting from {01}, and RC^{-1} calculates the same values in reverse
order from {36}. These circuits are also merged as shown in this figure.

Fig. 2. Data path architecture



3.3 Factoring in MixColumns and InvMixColumns 

MixColumns and InvMixColumns are modular multiplications with constant
polynomials (2) and (3) that can be written as the constant matrix
multiplications shown in Equations (4) and (5) respectively.

As seen in Equation (5), InvMixColumns contains a complete MixColumns
matrix. Therefore we merged these two functions into one circuit as shown in
Fig. 4. In addition, both functions can be broken into regular matrices whose
non-zero elements are only one of the values {08, 04, 02, 01}. Therefore the
number of common terms can be greatly reduced by factoring, finally resulting
in Equations (6) and (7). The result, shown in Table 1, is that the XOR logic
gates are decreased by 2/3 (from 592 XORs to 195 XORs) with only 2 XOR gates
of additional delay.




Fig. 4. MixColumns/InvMixColumns circuit

Table 1. Factoring effects of MixColumns and InvMixColumns

               | Original Matrices                  | Our
               | MixColumns | InvMixColumns | Total | Implementation
---------------+------------+---------------+-------+---------------
Number of XOR  | 152        | 440           | 592   | 195
Delay (gates)  | 3          | 5             | 5     | 7


4 S-Box Optimization 

4.1 Structure of New S-Box 

Designing a compact S-Box is one of the most critical problems for reducing the 
total circuit size of the Rijndael hardware. It is possible to implement the S-Box 
as a practical circuit based on its functional specification by using automatic 
logic synthesis tools, because the size of the S-Box function table is small; 256 
entries x 1 byte. However, a significant reduction in the size of the S-Box was 
achieved in [16], by using composite field arithmetic [7]. In the following, we
propose further optimization of the S-Box by introducing a new composite field.

Fig. 5 shows the outline of our S-Box implementation. The most costly
operation in the S-Box is the multiplicative inversion over a field A, where A
is an extension field over GF(2) with the irreducible polynomial m(x). To
reduce the cost of this operation, we adopted the following 3-stage method.

Fig. 5. The computation sequence of our S-Box implementation



(Stage 1) Map all elements of the field A to a composite field B, using an
isomorphism function δ.

(Stage 2) Compute the multiplicative inverses over the field B.

(Stage 3) Re-map the computation results to A, using the function δ^{-1}.

Even though isomorphism functions are required in this method, the cost of 
those functions can mostly be hidden by merging them with the affine transfor- 
mations. 



4.2 Multiplicative Inversion over A New Composite Field 

The composite field B in Stage 2 is constructed not by applying a single
degree-8 extension to GF(2), but by applying multiple extensions of smaller
degrees. To reduce the cost of Stage 2 as much as possible, we built the
composite field B by repeating degree-2 extensions under a polynomial basis
using these irreducible polynomials:

GF(2^2)          : x^2 + x + 1
GF((2^2)^2)      : x^2 + x + φ        (8)
GF(((2^2)^2)^2)  : x^2 + x + λ

where φ = {10}_2, λ = {1100}_2. The inverter over the field above has fewer
GF(2) operators compared with the composite field used in [16]

GF(2^4)      : x^4 + x + 1
GF((2^4)^2)  : x^2 + x + ω^{14}       (9)

where ω^{14} = {1001}_2.

Our hardware implementation of Stage 2 is shown in Fig. 6. For any composite
field GF((2^m)^n) constructed using a degree-n extension after a degree-m
extension, computing the multiplicative inverses can be done as a combination
of operations over the subfield GF(2^m), using the equation described
in [7,4]

P^{-1} = (P^r)^{-1} · P^{r-1},  where r = (2^{nm} − 1)/(2^m − 1).   (10)

In our case (n = 2, m = 4), so Equation (10) becomes

P^{-1} = (P^{17})^{-1} · P^{16}.   (11)

The circuit in Fig. 6 is an implementation of Equation (11), with additional
optimizations. In the circuit, P^{16} is computed first (note that the hardware
costs for raising to a power of 2 over Galois fields are very small) and then
P^{17} is obtained by multiplying P by P^{16} over GF(((2^2)^2)^2). This
operation requires only two multiplications, one addition and one constant
multiplication over GF((2^2)^2). Because P^{17} is always an element of
GF((2^2)^2) according to Fermat’s Little Theorem (i.e., the upper 4 bits of
P^{17} are always 0), computing the upper 4 bits of P^{17} is unnecessary [7].
(P^{17})^{-1} is computed recursively over GF((2^2)^2), then multiplied by
P^{16} over GF(((2^2)^2)^2), and finally P^{-1} is obtained. This
multiplication requires fewer circuit resources than usual, because
(P^{17})^{-1} is an element of GF((2^2)^2). Note that our multipliers and
inverter over the subfield GF((2^2)^2) are also small [15]. Further gate
reduction is possible by sharing parts of the three GF((2^2)^2) multipliers in
Fig. 6, where common inputs are used.
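
The algebra behind Equation (11) can be checked in software. The sketch below
is only an illustration of the identity, not of the circuit in Fig. 6: it
works in the standard polynomial representation modulo m(x) rather than in the
composite-field basis, so membership of P^{17} in the 16-element subfield is
tested as y^{16} = y instead of “upper 4 bits are 0”. The function names are
hypothetical.

```python
def gf_mul(a, b, modulus=0x11B):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= modulus
        b >>= 1
    return result

def gf_pow(a, e):
    """Square-and-multiply exponentiation in GF(2^8)."""
    result = 1
    while e:
        if e & 1:
            result = gf_mul(result, a)
        a = gf_mul(a, a)
        e >>= 1
    return result

def in_subfield_16(y):
    """True if y lies in the subfield with 16 elements (y^16 == y)."""
    return gf_pow(y, 16) == y

def inverse_via_subfield(p):
    """P^{-1} = (P^17)^{-1} * P^16, with the inner inverse taken inside
    the 16-element subfield, where y^{-1} = y^14 because y^15 = 1."""
    p16 = gf_pow(p, 16)        # four squarings
    p17 = gf_mul(p, p16)       # always lands in the subfield
    p17_inv = gf_pow(p17, 14)  # inversion in the small field
    return gf_mul(p17_inv, p16)
```

The point of the construction is that the only true inversion happens in the
16-element subfield, which is far cheaper than inverting in GF(2^8) directly.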



Fig. 6. Our implementation of an inverter over a composite field
GF(((2^2)^2)^2).

4.3 Generating Isomorphism Functions

The isomorphism functions δ and δ^{-1} are located at both ends of the
S-Boxes, and one of them is merged with an affine transformation. On the other
hand, Reference [16] proposes locating these isomorphism functions at the
circuit’s primary input and output, and thus they cannot be merged with an
affine transformation. The function is also required between the AddRoundKey
and the key expander in Fig. 1. Therefore our approach is much more suitable
for a reduced hardware implementation. The isomorphism functions δ and δ^{-1}
in Stages 1 and 3 were constructed as follows. First, search for a generator
element α in A and a generator β in B, where both α and β are roots of the
same primitive irreducible polynomial. Any primitive polynomial can be
applied, and here we use



p{x) = + 1. 



( 12 ) 



Once such elements are found, the definition table of the isomorphism
function δ (or δ^{-1}) is immediately determined, where α^k is mapped to β^k
(or β^k to α^k) for any 1 ≤ k ≤ 254. The hardware implementation of these
functions can be obtained by mapping only the basis elements of A (or B) into
B (or A), and these mappings are described as multiplications of constant
matrices over GF(2). The functions δ and δ^{-1} are as follows:

where the least significant bits are in the upper left corners.

All of these isomorphism functions and the constant multipliers in the S- 
Boxes are implemented as XOR arrays, and their Boolean logic is compressed 
by applying a factoring technique based on a greedy algorithm [12]. 



4.4 Implementation Results of the S-Box 

Table 2 shows the performance of our multiplicative inverter and S-Box
described above in comparison with that using Equation (9). The S-Box
implementations are also compared with the one automatically generated by a
synthesis tool from a look-up table. A 0.11-μm CMOS standard cell library (one
gate is equivalent to a 2-way NAND) is used here, and the delay time is
evaluated under the worst-case conditions. The hardware size of our S-Box
using the field GF(((2^2)^2)^2) is 294 gates, which is about 20% smaller and
slightly faster than the one using the field GF((2^4)^2).

Table 2. S-Box features of proposed method (gate = 2-way NAND)

                   | Inverter        | S-Box           | S-Box^{-1}
Method             | Area    | Delay | Area    | Delay | Area    | Delay
                   | (gates) | (ns)  | (gates) | (ns)  | (gates) | (ns)
-------------------+---------+-------+---------+-------+---------+------
Ours, Equation (8) | 173     | 2.55  | 294     | 3.69  | ← merged
Equation (9)       | 241     | 2.50  | 362     | 3.75  | ← merged
Look-up Table      | -       | -     | 696     | 2.71  | 700     | 2.29

Our S-Box consists of affine transformations, isomorphism functions, inverters
and selectors, and can be applied to both encryption and decryption. On the
other hand, the look-up table method requires two different circuits, an S-Box
for encryption and an S-Box^{-1} for decryption. The S-Box tables appear as
random numbers to CAD tools, and therefore logic compression is very hard. As
a result, a large amount of hardware, 1,396 (= 696 + 700) gates, is required
for each one-byte S-Box based on the look-up table method, while our method is
less than 1/4 of that size.

By applying our new composite field, merging the isomorphism functions with 
affine transformations, using a factoring technique, and combining the encryption 
and decryption paths, a very small S-Box was produced. 



5 Performance Comparison in ASICs 

The architecture described in Section 3 has been implemented by using 0.11-μm
CMOS technology, and the extremely small size of 5.4 Kgates was obtained
with a 7.62-ns cycle time (131.24 MHz) under the worst-case conditions. The
gate size of each component and the critical path delay are detailed in Table
3 and Fig. 7, respectively. The function SubBytes (S-Box) occupies about 22%
of the circuit area, and accounts for almost half of the delay time. The second
major component is neither MixColumns nor AddRoundKey, but the selectors.
The requirement to use selectors is not obvious from the Rijndael algorithm
specification, where they appear as conditional branches and data selections.
However, they require 1,099 (= 699 + 400) gates (20.36% of the circuit), because
of the wide data width. In order to drive those selectors, drivers with high
fan-out are also required. Therefore, we carefully analyzed the critical data
path, optimized the order of data selection, and adjusted the driver size. As
a result, the delay time of the selector and driver section was reduced from
3.46 ns down to 1.95 ns, without changing the total gate count.

Using our proposed architecture, we designed and synthesized five
implementations as shown in Fig. 8. Higher throughputs with higher parallelism
were achieved by increasing the number of S-Boxes and the bus width. Four
S-Boxes are shared between the data encryption block and the key expander in
the 5- and 3-cycle/round versions. In the other three implementations, the key
expanders have their own S-Boxes. Two circuits were synthesized from each
implementation (a total of ten implementations), one optimized for size and
the other for speed. The sizes and speeds are also shown in Table 4, in
comparison with other ASIC implementations [8,18,10,9] under the worst-case
conditions. Data and key sizes are both 128 bits in all implementations,
except that of [10], where 128-bit data and a 256-bit key (14 rounds) are
used. A gate wireability of 80% is assumed to calculate the silicon area of
our implementations. Reference [16] shows a throughput of 7.5 Gbps with 32
parallel cores, with a circuit size for encryption of 256 Kgates. However,
this number was not evaluated by any synthesis tool, so we did not include it
in the table.

Fig. 7. Critical path delay

Table 3. Factoring effects of MixColumns and InvMixColumns

Components                     | Gates   | %
-------------------------------+---------+--------
Encryption/Decryption Block    | (3,305) | (61.23)
  Data Register                | 864     | 16.01
  ShiftRows                    | 160     | 2.96
  SubBytes                     | 1,176   | 21.79
  MixColumns/InvMixColumns     | 350     | 6.48
  AddRoundKey                  | 56      | 1.04
  Selector                     | 699     | 12.95
Key Expander                   | (1,896) | (35.12)
  Key Register                 | 864     | 16.01
  InvMixColumns                | 294     | 5.45
  RC/RC^{-1}                   | 100     | 1.85
  XOR                          | 238     | 4.41
  Selector                     | 400     | 7.41
Controller, Selector, Driver   | 197     | 3.65
Total                          | 5,398   | 100.00

It is obvious in our implementations that more hardware resources yield
higher throughput. For instance, the number of operation clock cycles can be
reduced by increasing the size of the S-Box, which allows more parallel
computation. Increases in fan-out can also be used to increase the speed. In
order to clarify the total efficiency of each implementation, we show the
throughput per gate on the right side of Table 4. In general, it is not easy
to compare implementations using different CAD tools and different technology
libraries. However, even considering this difficulty of precise comparison,
our hardware architecture is by far the best. Our smallest implementation is
less than 1/6 of the 33.8 Kgates in the best previous approach [9]. Our
1-cycle/round version sets a new record for throughput at 2.6 Gbps, not only
in ECB mode but also in CBC cipher feedback mode. Reference [8] shows the
second best throughput as 1.95 Gbps, but it uses 113.5 times as many gates,
because all 11 rounds are unrolled. Throughout these comparisons of ASIC
implementations, our hardware architecture has advantages in both size and
speed.

Fig. 8. Data Path Architectures of each implementation
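
The throughput and efficiency figures quoted for our implementations follow
from the clock frequency, the total cycle count per 128-bit block and the gate
count. A short sketch of that arithmetic (the function names are hypothetical;
the numeric inputs used in the usage note are the ones quoted in the text):

```python
def throughput_mbps(freq_mhz, total_cycles, block_bits=128):
    """Throughput in Mbps when one block takes total_cycles clock cycles."""
    return block_bits * freq_mhz / total_cycles

def kbps_per_gate(mbps, gates):
    """Throughput-per-area efficiency in Kbps per gate."""
    return mbps * 1000.0 / gates

def total_cycles(initial_cycles, cycles_per_round, rounds=10):
    """Cycles for one block: an initial AddRoundKey stage plus the rounds."""
    return initial_cycles + cycles_per_round * rounds
```

For the smallest circuit, total_cycles(4, 5) is 54, and 131.24 MHz over 54
cycles gives about 311.09 Mbps, or about 57.63 Kbps per gate for 5,398 gates.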

6 Conclusion 

In this paper, a compact yet high-speed architecture for Rijndael was proposed 
and evaluated through ASIC implementations. In order to minimize the hard- 
ware size, the order of the arithmetic functions was changed, and encryption and 
decryption data paths were efficiently combined. Logic optimization techniques 
such as factoring were applied to the arithmetic components, and gate counts 
were greatly reduced. 

Table 4. Performance comparison in ASIC implementations (worst case)

Design        Cycles  S-Box        Area     Area    Max. Freq.  Throughput  Throughput/Area  Notes
              /Round  (bytes)      (gates)  (mm^2)  (MHz)       (Mbps)      (Kbps/gate)
Ours 0.11 µm  5       [illegible]  5,398    0.052   131.24      311.09      57.63            Total 54 cycles
                                   10,338   0.099   222.22      526.74      50.95
              4       8            6,292    0.060   137.55      400.15      63.60            Total 44 cycles
                                   10,990   0.106   219.30      637.96      58.05
              3       8            7,998    0.077   137.17      548.68      68.60            Total 32 cycles
                                   14,777   0.142   218.82      875.28      59.23
              2       12           8,836    0.085   137.17      798.08      90.32            Total 22 cycles
                                   17,016   0.163   217.86      1,267.55    74.49
              1       20           12,454   0.130   145.35      1,691.35    135.81           Total 11 cycles
                                   21,337   0.205   224.22      2,609.11    122.28
[10] 0.18 µm  1       48           184,000  4.23    48          435         2.51             Throughput for 256-bit key; 256-bit data and key supported; decryption not supported
[9] 0.35 µm   1       40           33,850   -       -           509.70      15.06
[8] 0.35 µm   1/11    400          612,834  -       15.23       1,950.03    3.18             11 rounds unrolled
[18] 0.5 µm   1       40           68,872   20.74   21.18       271.13      -                4 transistors/gate is assumed
              1       40           160,421  33.85   47.36       605.77      -

Our architecture provides high flexibility, from a compact 32-bit bus implementation to a high-speed implementation using a 128-bit bus. In previous work the S-Box was implemented as look-up table logic and required extensive hardware resources. We introduced a new composite field GF(((2^2)^2)^2) and proposed an optimization method for the S-Box. Our S-Box requires less than 1/4 the size of one using a look-up table, and also showed 20% better performance in comparison with the one using a GF((2^4)^2) field.

Our smallest implementation using 0.11-µm CMOS technology is 5.4 Kgates,
which is less than 1/6 the size of the best hardware of previous work. Making 
the best use of the parallel processing allowed by Rijndael, a high-speed version 
obtained the best performance of 2.6 Gbps with 21.3 Kgates. Thus, our Rijndael 
hardware can be applied to various targets from mobile equipment to high-end 
security servers. 

Our continuing research is to develop and evaluate even faster hardware for 
10 Gbps class high-speed communication links and beyond. 
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Abstract. Within the security architecture of the 3GPP system there is 
a standardised encryption mode f8 based on the block cipher KASUMI. In this work we examine the pseudorandomness of the block cipher KASUMI and the provable security of f8. First we show that the three round KASUMI is not a pseudorandom permutation ensemble but the four round KASUMI is a pseudorandom permutation ensemble under the adaptive distinguisher model by investigating the properties of the round functions in a clear way. Second we provide the upper bound on the security of f8 mode under the reasonable assumption from the first result by means of the left-or-right security notion.



1 Introduction 

There is a standardised encryption algorithm f8 within the security architecture of the 3GPP (3rd Generation Partnership Project) system, and this algorithm is based on the block cipher KASUMI, which produces a 64-bit output from a 64-bit input under the control of a 128-bit key [12]. To guarantee message confidentiality over a radio access link of W-CDMA IMT-2000, the f8 encryption mode with KASUMI has been proposed. The purpose of this work is to investigate the pseudorandomness of the block cipher KASUMI and the provable security of f8.

A block cipher can be regarded as a family of permutations on a message space indexed by a secret key. Luby-Rackoff [7] introduced a theoretical model for the security of block ciphers by using the notion of pseudorandom and super-pseudorandom permutations. A pseudorandom permutation can be interpreted as a block cipher that no attacker making polynomially many encryption queries can distinguish from a perfect random permutation. In [7], Luby and Rackoff used the DES-type transformation in order to construct a pseudorandom permutation from a pseudorandom function. They showed that the DES-type transformation with three rounds yields a 2n-bit pseudorandom permutation under the assumption that each round function is an n-bit pseudorandom function. Sakurai-Zheng [11] showed that the three round MISTY-type transformation is not a pseudorandom permutation ensemble. The MISTY-type transformation [8,9] is another two-block structure, different from the DES-type. Recently, Gilbert-Minier [4] and Kang et al. [6] showed independently that the four round MISTY-type transformation is a pseudorandom permutation.

C. Boyd (Ed.): ASIACRYPT 2001, LNCS 2248, pp. 255-271, 2001.

(c) Springer-Verlag Berlin Heidelberg 2001

The overall structure of KASUMI is the DES-type, but its round function FO is composed of a three round MISTY-type transformation, which is not a pseudorandom function. Thus we cannot straightforwardly apply the Luby-Rackoff result to KASUMI. The FO function within KASUMI has the FI function as its component function, which is composed of a four round unbalanced MISTY-type transformation. We show that this is a pseudorandom permutation. We also prove that the three round KASUMI is not a pseudorandom permutation but the four round KASUMI is a pseudorandom permutation. In [6], the authors investigated the pseudorandomness of KASUMI for non-adaptive distinguishers. In this paper we consider the security model for adaptive distinguishers, similar to the approach of Naor and Reingold [10], and investigate the properties of the round function of KASUMI more precisely than previous results such as [4] and [6].

On the other hand, f8 is one of the modes of operation for block ciphers. Several modes of operation for block ciphers have been proposed to encrypt plaintexts longer than one block and to fulfil varying application requirements. As standardized modes of operation, ECB (electronic codebook), CBC (cipher block chaining), CFB (cipher feedback) and OFB (output feedback) are known [3]. The 3GPP f8 encryption mode can be seen as a variant of OFB mode.

Proofs of security for modes of operation were started by Bellare et al. [1] in 1994, who analyzed the security of the CBC MAC mode. In 1997, Bellare et al. [2] introduced security notions for symmetric encryption schemes and proved the security of CTR mode and CBC mode. Recently, Alkassar et al. [13] analyzed the security of CFB mode and proposed the OCFB mode, which improves the performance of CFB mode. In this paper we show that the 3GPP f8 encryption mode is secure by means of the left-or-right security notion. To prove this fact we need the assumption that the underlying block cipher KASUMI is secure. This assumption is reasonable since, by our first result, we already obtain that KASUMI is a pseudorandom permutation ensemble.

2 Pseudorandomness of the Block Cipher KASUMI 

2.1 Preliminaries 

Let $I_n$ denote the set of all $n$-bit strings and $\mathcal{P}_n$ be the set of all permutations from $I_n$ to itself, where $n$ is a positive integer. That is, $\mathcal{P}_n = \{\pi : I_n \to I_n \mid \pi \text{ is a bijection}\}$. We define an $n$-bit perfect random permutation as a uniformly drawn element of $\mathcal{P}_n$.

Definition 1. $\mathcal{P}_n$ is called the UPE (uniform permutation ensemble) if all permutations in $\mathcal{P}_n$ are uniformly distributed. That is, for any permutation $\pi \in \mathcal{P}_n$,

\[ \Pr(\pi) = \frac{1}{(2^n)!}. \]

We consider the following security model. Let $\mathcal{D}$ be a computationally unbounded distinguisher with an oracle $\mathcal{O}$. The oracle $\mathcal{O}$ randomly chooses a permutation $\pi$ from the UPE $\mathcal{P}_n$ or from a permutation ensemble $\Lambda_n \subset \mathcal{P}_n$. For an $n$-bit block cipher, $\Lambda_n$ is the set of permutations determined by all the secret keys. The purpose of the distinguisher $\mathcal{D}$ is to distinguish whether the oracle $\mathcal{O}$ implements the UPE $\mathcal{P}_n$ or $\Lambda_n$.

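Definition 2. Let $\mathcal{D}$ be a distinguisher, $\mathcal{P}_n$ be the UPE, and $\Lambda_n$ be a permutation ensemble obtained from a block cipher. Then the advantage $ADV_{\mathcal{D}}$ of $\mathcal{D}$ is defined by

\[ ADV_{\mathcal{D}} = \left| \Pr(\mathcal{D} \text{ outputs } 1 \mid \mathcal{O} \leftarrow \mathcal{P}_n) - \Pr(\mathcal{D} \text{ outputs } 1 \mid \mathcal{O} \leftarrow \Lambda_n) \right| , \]

where $\mathcal{O} \leftarrow \mathcal{P}_n$ and $\mathcal{O} \leftarrow \Lambda_n$ denote that $\mathcal{O}$ implements $\mathcal{P}_n$ and $\Lambda_n$, respectively.

Assume that the distinguisher $\mathcal{D}$ is restricted to make at most $poly(n)$ queries to the oracle $\mathcal{O}$, where $poly(n)$ is some polynomial in $n$. We call $\mathcal{D}$ a pseudorandom distinguisher if it queries $x$ and the oracle answers $y = \pi(x)$, where $\pi$ is a permutation randomly chosen by $\mathcal{O}$. We say that $\mathcal{D}$ is a super-pseudorandom distinguisher if it is a pseudorandom distinguisher and can also make a query $y$ and receive $x = \pi^{-1}(y)$ from the oracle $\mathcal{O}$.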

Definition 3. A function $h : \mathbb{N} \to \mathbb{R}$ is called negligible if for any constant $c > 0$ and all sufficiently large $n \in \mathbb{N}$, $h(n) < 1/n^c$.

Definition 4. Let $\Lambda_n$ be an efficiently computable permutation ensemble. Then $\Lambda_n$ is called a PPE (pseudorandom permutation ensemble) if $ADV_{\mathcal{D}}$ is negligible for any pseudorandom distinguisher $\mathcal{D}$.

Definition 5. Let $\Lambda_n$ be an efficiently computable permutation ensemble. Then $\Lambda_n$ is called an SPPE (super-pseudorandom permutation ensemble) if $ADV_{\mathcal{D}}$ is negligible for any super-pseudorandom distinguisher $\mathcal{D}$.

In Definitions 4 and 5, a permutation ensemble is efficiently computable if all permutations in the ensemble can be computed efficiently. See [10] for the rigorous definition. It is a reasonable assumption that $\Lambda_n$ is an efficiently computable permutation ensemble if it is obtained from an $n$-bit block cipher. Hence we assume that any permutation ensemble obtained from a block cipher is efficiently computable.

We define two transformations, DES-type and MISTY-type, which are obtained from two representative structures of current block ciphers. Let $\mathcal{F}_n$ denote the set of all functions from $I_n$ to itself. We briefly say that $f$ is an $n$-bit function (resp. permutation) when $f \in \mathcal{F}_n$ (resp. $f \in \mathcal{P}_n$).

Definition 6. For any $n$-bit function $f \in \mathcal{F}_n$, the $2n$-bit DES-type permutation $D_f \in \mathcal{P}_{2n}$ is defined by $D_f(L, R) = (R, L \oplus f(R))$, where $L, R \in I_n$.

Definition 7. For any $n$-bit permutation $f \in \mathcal{P}_n$, the $2n$-bit MISTY-type permutation $M_f \in \mathcal{P}_{2n}$ is defined by $M_f(L, R) = (R, f(L) \oplus R)$, where $L, R \in I_n$.
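The two definitions above can be illustrated with a small executable sketch. It assumes a toy word size of n = 4 bits and represents an n-bit function or permutation as a lookup table; the names `des_type` and `misty_type` are chosen here for illustration only.

```python
import random


def des_type(f, L, R):
    """D_f(L, R) = (R, L xor f(R)); f may be any n-bit function."""
    return R, L ^ f[R]


def misty_type(f, L, R):
    """M_f(L, R) = (R, f(L) xor R); f must be an n-bit permutation."""
    return R, f[L] ^ R


n = 4  # toy word size, not a KASUMI parameter
rng = random.Random(0)

# An arbitrary (not necessarily bijective) n-bit function.
func = [rng.randrange(1 << n) for _ in range(1 << n)]
# A random n-bit permutation.
perm = list(range(1 << n))
rng.shuffle(perm)

pairs = [(L, R) for L in range(1 << n) for R in range(1 << n)]
des_images = {des_type(func, L, R) for L, R in pairs}
misty_images = {misty_type(perm, L, R) for L, R in pairs}

# Both maps hit every 2n-bit value exactly once, so both are permutations.
print(len(des_images), len(misty_images))
```

The DES-type map is a bijection for any round function, whereas the MISTY-type map relies on $f$ itself being a permutation, which is why Definition 7 requires $f \in \mathcal{P}_n$.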

Several notable results about the pseudorandomness of DES-type and MISTY-type transformations are as follows. Note that a PFE (pseudorandom function ensemble) can be defined similarly to Definition 4 by considering the function space instead of the permutation space.

- $D_{f_2} \circ D_{f_1}$ is not a $2n$-bit PPE and $D_{f_3} \circ D_{f_2} \circ D_{f_1}$ is not a $2n$-bit SPPE, although all $f_i$'s ($i = 1, 2, 3$) are independently chosen from an $n$-bit PFE [7].

- $D_{f_3} \circ D_{f_2} \circ D_{f_1}$ is a $2n$-bit PPE and $D_{f_4} \circ D_{f_3} \circ D_{f_2} \circ D_{f_1}$ is a $2n$-bit SPPE if all $f_i$'s ($i = 1, 2, 3, 4$) are independently chosen from an $n$-bit PFE [7].

- $M_{f_3} \circ M_{f_2} \circ M_{f_1}$ is not a $2n$-bit PPE and $M_{f_4} \circ M_{f_3} \circ M_{f_2} \circ M_{f_1}$ is not a $2n$-bit SPPE, although each $f_i$ ($i = 1, 2, 3, 4$) is chosen independently from an $n$-bit PPE [4, 11].

- $M_{f_4} \circ M_{f_3} \circ M_{f_2} \circ M_{f_1}$ is a $2n$-bit PPE and $M_{f_5} \circ M_{f_4} \circ M_{f_3} \circ M_{f_2} \circ M_{f_1}$ is a $2n$-bit SPPE, where all $f_i$'s ($i = 1, 2, 3, 4, 5$) are independently chosen from an $n$-bit PPE [4, 5, 6].

On the other hand, KASUMI is a modified version of the block cipher MISTY1 [9], and we can classify the permutation of KASUMI into the following three stages:

- The overall permutation of KASUMI is a 64-bit permutation composed of the eight round DES-type permutation with the two round functions FO and FL.

- The FO function is a 32-bit permutation composed of the three round MISTY-type transformation with the round permutation FI.

- The FI function is a 16-bit permutation which is composed of the four round unbalanced MISTY-type transformation obtained from the 7-bit S-box S7 and the 9-bit S-box S9.

First we show that the FI function is a 16-bit PPE by examining the pseudorandomness of the unbalanced MISTY-type transformation. Second we prove, on the basis of the first result, that three round KASUMI is not a 64-bit PPE but four round KASUMI is a 64-bit PPE. Note that the FO function is not a 32-bit PPE, so the three round DES-type permutation of KASUMI should not be expected to be a 64-bit PPE in the way the Luby-Rackoff cipher is. Since the FL function serves only to mix in round keys, we can omit the FL function when analyzing the pseudorandomness of KASUMI.



2.2 Pseudorandomness of the Unbalanced MISTY-Type 
Transformation 



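We state two simple but useful lemmas whose proofs are given in [6].

Lemma 1. Let $\pi$ be a permutation chosen from the UPE $\mathcal{P}_n$. Then for any $x_1 \neq x_2$ and $y \in I_n$,

\[ \Pr(\pi(x_1) \oplus \pi(x_2) = y) = \begin{cases} \dfrac{1}{2^n - 1} & \text{if } y \neq 0 , \\ 0 & \text{otherwise.} \end{cases} \]

Lemma 2. Let $\pi_1$ and $\pi_2$ be two permutations independently chosen from the UPE $\mathcal{P}_n$. Then for any $a, b, c, d, y \in I_n$,

\[ \Pr(\pi_1(a) \oplus \pi_1(b) \oplus \pi_2(c) \oplus \pi_2(d) = y) < \text{[bound illegible in the source]} , \quad \text{for } n > 2 . \]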
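Lemma 1 can be checked exhaustively for a small word size. The sketch below is an illustration only: it assumes n = 3, enumerates all 8! permutations of the 3-bit strings, and tabulates the distribution of $\pi(x_1) \oplus \pi(x_2)$ for one fixed pair $x_1 \neq x_2$. The function name is hypothetical.

```python
from fractions import Fraction
from itertools import permutations


def xor_difference_distribution(n, x1, x2):
    """Exact distribution of pi(x1) xor pi(x2) over all n-bit permutations pi."""
    size = 1 << n
    counts = [0] * size
    total = 0
    for pi in permutations(range(size)):
        counts[pi[x1] ^ pi[x2]] += 1
        total += 1
    return [Fraction(c, total) for c in counts]


# n = 3 keeps the enumeration small (8! = 40320 permutations).
dist = xor_difference_distribution(3, 0, 5)
print(dist)
```

For this toy size every non-zero difference occurs with probability exactly $1/(2^3 - 1)$ and the zero difference never occurs, as the lemma states.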
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Now we define two unbalanced MISTY-type transformations to examine ac- 
curately the pseudorandomness of FI function. 

Definition 8. Let $n$ and $m$ be two positive integers such that $m < n$. Then for any $n$-bit permutation $f$ and $m$-bit permutation $g$, two $(n + m)$-bit unbalanced MISTY-type transformations $M_f \in \mathcal{P}_{n+m}$ and $M_g \in \mathcal{P}_{n+m}$ are defined by

\[ M_f(L, R) = (R, f(L) \oplus \bar{R}) \in I_m \times I_n , \quad \forall (L, R) \in I_n \times I_m \]

and

\[ M_g(L, R) = (R, g(L) \oplus \tilde{R}) \in I_n \times I_m , \quad \forall (L, R) \in I_m \times I_n , \]

where for any $n$-bit vector $x$, $\tilde{x}$ denotes the $m$-bit value obtained by discarding the $n - m$ most-significant bits, and for any $m$-bit vector $y$, $\bar{y}$ denotes the $n$-bit value obtained by adding $n - m$ zero bits to the most-significant end.

Note that the FI function of KASUMI can be represented as the 16-bit permutation $M_{f_4} \circ M_{f_3} \circ M_{f_2} \circ M_{f_1}$, where $f_1, f_3$ are 9-bit permutations and $f_2, f_4$ are 7-bit permutations. The pseudorandomness of the FI function is guaranteed by the following theorem.

Theorem 1. For any positive integers $n$ and $m$ such that $m < n$, let $f_1, f_3 \in \mathcal{P}_n$ and $f_2, f_4 \in \mathcal{P}_m$ be independently chosen from an $n$-bit and an $m$-bit PPE, respectively. Then the four round unbalanced MISTY-type transformation $M_{f_4} \circ M_{f_3} \circ M_{f_2} \circ M_{f_1}$ is an $(n + m)$-bit PPE.
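The four round unbalanced structure can be sketched as follows, assuming the reading of Definition 8 in which the $m$-bit word is zero-extended and the $n$-bit word is truncated to its $m$ least-significant bits. Random lookup tables stand in for the round permutations; they are not the KASUMI S-boxes S9 and S7, and no key mixing is modeled. All names are illustrative.

```python
import random


def truncate(v, m):
    """Keep the m least-significant bits (drop the n - m top bits)."""
    return v & ((1 << m) - 1)


def unbalanced_misty4(f1, f2, f3, f4, x, n, m):
    """Four-round unbalanced MISTY on an (n+m)-bit input x = (L, R)."""
    L, R = x >> m, x & ((1 << m) - 1)   # L: n bits, R: m bits
    L, R = R, f1[L] ^ R                 # R is zero-extended implicitly
    L, R = R, f2[L] ^ truncate(R, m)    # now L: n bits, R: m bits
    L, R = R, f3[L] ^ R
    L, R = R, f4[L] ^ truncate(R, m)
    return (L << m) | R


rng = random.Random(2001)


def rand_perm(bits):
    table = list(range(1 << bits))
    rng.shuffle(table)
    return table


# FI-like sizes: n = 9 and m = 7 give a 16-bit transformation.
f1, f3 = rand_perm(9), rand_perm(9)
f2, f4 = rand_perm(7), rand_perm(7)
images = {unbalanced_misty4(f1, f2, f3, f4, x, 9, 7) for x in range(1 << 16)}
print(len(images))
```

All $2^{16}$ inputs map to distinct outputs, so the composition is a 16-bit permutation for this choice of tables, as expected of a composition of four bijective rounds.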

Recall that a pseudorandom distinguisher $\mathcal{D}$ can make a query $x$, to which the oracle $\mathcal{O}$ answers $y = \pi(x)$, where $\pi$ is a permutation randomly chosen by $\mathcal{O}$. Now we assume that $\mathcal{D}$ makes exactly $q$ queries and refer to the sequence $\{(x^{(1)}, y^{(1)}), \dots, (x^{(q)}, y^{(q)})\}$ of all query-answer pairs as the $\mathcal{D}$-transcript, where $q = poly(n)$. We consider an adaptive pseudorandom distinguisher as in the following definition.

Definition 9. $\mathcal{D}$ is called an adaptive pseudorandom distinguisher if it has a transcript $\{(x^{(1)}, y^{(1)}), \dots, (x^{(q)}, y^{(q)})\}$ and a function $C_{\mathcal{D}}$ of the $\mathcal{D}$-transcript such that for every $2 \le i \le q$,

\[ x^{(i)} = C_{\mathcal{D}}(\{(x^{(1)}, y^{(1)}), \dots, (x^{(i-1)}, y^{(i-1)})\}) \]

and

\[ \text{the output of } \mathcal{D} = C_{\mathcal{D}}(\{(x^{(1)}, y^{(1)}), \dots, (x^{(q)}, y^{(q)})\}) . \]

Under the adaptive distinguisher model, the $i$-th query of $\mathcal{D}$ is fully determined by the first $i - 1$ query-answer pairs, and the output of $\mathcal{D}$ is a function of its transcript. Throughout this paper we assume that all queries are distinct.

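To prove Theorem 1, we formally define a bad event and estimate its probability.

Definition 10. For any $n$-bit permutation $f_1$ and $m$-bit permutation $f_2$, $BAD(f_1, f_2)$ is defined as the set of all $\mathcal{D}$-transcripts $\sigma = \{(x^{(1)}, y^{(1)}), \dots, (x^{(q)}, y^{(q)})\}$ satisfying: $\exists\, 1 \le i < j \le q$ such that

\[ f_1(x_L^{(i)}) \oplus \bar{x}_R^{(i)} = f_1(x_L^{(j)}) \oplus \bar{x}_R^{(j)} \]

or

\[ f_2(x_R^{(i)}) \oplus \widetilde{f_1(x_L^{(i)})} \oplus x_R^{(i)} = f_2(x_R^{(j)}) \oplus \widetilde{f_1(x_L^{(j)})} \oplus x_R^{(j)} , \]

where $x^{(i)} = (x_L^{(i)}, x_R^{(i)}) \in I_n \times I_m$ for all $1 \le i \le q$. Here, as in Definition 8, $\tilde{\cdot}$ denotes truncation to the $m$ least-significant bits and $\bar{\cdot}$ denotes zero-extension to $n$ bits.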

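Lemma 3. Let $f_1$ and $f_2$ be chosen independently from the UPE $\mathcal{P}_n$ and the UPE $\mathcal{P}_m$, respectively. Then for any $\mathcal{D}$-transcript $\sigma = \{(x^{(1)}, y^{(1)}), \dots, (x^{(q)}, y^{(q)})\}$ and $n > m > 2$,

\[ \Pr(\sigma \in BAD(f_1, f_2)) < (q^2 - q) \left( \frac{1}{2^n} + \frac{1}{2^m} \right) . \]

Proof. By definition, $\sigma \in BAD(f_1, f_2)$ if there exist $1 \le i < j \le q$ such that either

\[ f_1(x_L^{(i)}) \oplus \bar{x}_R^{(i)} = f_1(x_L^{(j)}) \oplus \bar{x}_R^{(j)} \]

or

\[ f_2(x_R^{(i)}) \oplus \widetilde{f_1(x_L^{(i)})} \oplus x_R^{(i)} = f_2(x_R^{(j)}) \oplus \widetilde{f_1(x_L^{(j)})} \oplus x_R^{(j)} , \]

where $\tilde{\cdot}$ denotes truncation to the $m$ least-significant bits and $\bar{\cdot}$ denotes zero-extension to $n$ bits. For any given $i$ and $j$, we estimate the probabilities of these two events. We have the following three cases.

Case 1: $x_L^{(i)} \neq x_L^{(j)}$ and $x_R^{(i)} = x_R^{(j)}$. Since $f_1$ is a permutation,

\[ \Pr\left( f_1(x_L^{(i)}) \oplus \bar{x}_R^{(i)} = f_1(x_L^{(j)}) \oplus \bar{x}_R^{(j)} \right) = \Pr\left( f_1(x_L^{(i)}) = f_1(x_L^{(j)}) \right) = 0 . \]

Observe that, by a result similar to Lemma 1,

\[ \Pr\left( f_2(x_R^{(i)}) \oplus \widetilde{f_1(x_L^{(i)})} \oplus x_R^{(i)} = f_2(x_R^{(j)}) \oplus \widetilde{f_1(x_L^{(j)})} \oplus x_R^{(j)} \right) = \Pr\left( \widetilde{f_1(x_L^{(i)})} = \widetilde{f_1(x_L^{(j)})} \right) = \frac{2^n (2^{n-m} - 1) \cdot (2^n - 2)!}{2^n!} = \frac{2^{n-m} - 1}{2^n - 1} . \]

Case 2: $x_L^{(i)} = x_L^{(j)}$ and $x_R^{(i)} \neq x_R^{(j)}$. In this case the probability of the first event is equal to $\Pr(\bar{x}_R^{(i)} = \bar{x}_R^{(j)}) = 0$. By Lemma 1, the probability of the second event is estimated as

\[ \Pr\left( f_2(x_R^{(i)}) \oplus f_2(x_R^{(j)}) = x_R^{(i)} \oplus x_R^{(j)} \right) = \frac{1}{2^m - 1} . \]

Case 3: $x_L^{(i)} \neq x_L^{(j)}$ and $x_R^{(i)} \neq x_R^{(j)}$. By Lemma 1, the probability of the first event is estimated as

\[ \Pr\left( f_1(x_L^{(i)}) \oplus f_1(x_L^{(j)}) = \bar{x}_R^{(i)} \oplus \bar{x}_R^{(j)} \right) = \frac{1}{2^n - 1} . \]

Similarly, by Lemma 2, the probability of the second event is also estimated as

\[ \Pr\left( \widetilde{f_1(x_L^{(i)})} \oplus \widetilde{f_1(x_L^{(j)})} \oplus f_2(x_R^{(i)}) \oplus f_2(x_R^{(j)}) = x_R^{(i)} \oplus x_R^{(j)} \right) < \frac{1}{2^m - 1} , \]

since $n > m > 2$.

Hence, in every case, we obtain that

\[ \Pr\left( f_1(x_L^{(i)}) \oplus \bar{x}_R^{(i)} = f_1(x_L^{(j)}) \oplus \bar{x}_R^{(j)} \right) \le \frac{1}{2^n - 1} \]

and

\[ \Pr\left( f_2(x_R^{(i)}) \oplus \widetilde{f_1(x_L^{(i)})} \oplus x_R^{(i)} = f_2(x_R^{(j)}) \oplus \widetilde{f_1(x_L^{(j)})} \oplus x_R^{(j)} \right) \le \frac{1}{2^m - 1} . \]

Therefore, summing over the $(q^2 - q)/2$ pairs $i < j$,

\[ \Pr(\sigma \in BAD(f_1, f_2)) \le \frac{q^2 - q}{2} \left( \frac{1}{2^n - 1} + \frac{1}{2^m - 1} \right) < (q^2 - q) \left( \frac{1}{2^n} + \frac{1}{2^m} \right) . \qquad \Box \]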



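Definition 11. Let $\Lambda_{n+m}$ be the $(n + m)$-bit permutation ensemble obtained from $\Lambda_{n+m}(f_1, f_2, f_3, f_4) = M_{f_4} \circ M_{f_3} \circ M_{f_2} \circ M_{f_1}$. Then $T_{\mathcal{P}_{n+m}}$ and $T_{\Lambda_{n+m}}$ are defined as the random variables such that $T_{\mathcal{P}_{n+m}}$ is the $\mathcal{D}$-transcript when the oracle $\mathcal{O}$ implements the UPE $\mathcal{P}_{n+m}$ and $T_{\Lambda_{n+m}}$ is the $\mathcal{D}$-transcript when the oracle $\mathcal{O}$ implements the permutation ensemble $\Lambda_{n+m}$.

Lemma 4. Let $\Lambda_{n+m}$ be the $(n + m)$-bit permutation ensemble of all $\Lambda_{n+m}(f_1, f_2, f_3, f_4)$ such that $f_1, f_3 \in \mathcal{P}_n$ and $f_2, f_4 \in \mathcal{P}_m$ are independently chosen from the $n$-bit and $m$-bit UPEs, respectively. Then for any $\mathcal{D}$-transcript $\sigma = \{(x^{(1)}, y^{(1)}), \dots, (x^{(q)}, y^{(q)})\}$,

\[ \left| \Pr\left( T_{\Lambda_{n+m}} = \sigma \mid \sigma \notin BAD(f_1, f_2) \right) - \Pr\left( T_{\mathcal{P}_{n+m}} = \sigma \right) \right| < \varepsilon_{n,m,q} , \]

where

\[ \varepsilon_{n,m,q} = \frac{1}{2^{n+m} (2^n - 1)(2^m - 1) \cdots (2^n - q + 1)(2^m - q + 1)} . \]

Proof. For any possible $\mathcal{D}$-transcript $\sigma$ we have that

\[ \Pr\left( T_{\mathcal{P}_{n+m}} = \sigma \right) = \frac{(2^{n+m} - q)!}{2^{n+m}!} . \]

Consider any specific $n$-bit permutation $f_1$ and $m$-bit permutation $f_2$ such that $\sigma \notin BAD(f_1, f_2)$. Note that $T_{\Lambda_{n+m}} = \sigma$ if and only if for all $1 \le i \le q$, $y^{(i)} = \Lambda_{n+m}(x^{(i)})$. Since $\Lambda_{n+m} = M_{f_4} \circ M_{f_3} \circ M_{f_2} \circ M_{f_1}$, writing $y^{(i)} = (y_L^{(i)}, y_R^{(i)}) \in I_n \times I_m$, this is equivalent to

\[ f_3(L_2^{(i)}) = y_L^{(i)} \oplus \bar{R}_2^{(i)} \quad \text{and} \quad f_4(R_2^{(i)}) = y_R^{(i)} \oplus \tilde{y}_L^{(i)} , \]

where $(L_2^{(i)}, R_2^{(i)}) = M_{f_2} \circ M_{f_1}(x^{(i)})$. By definition of $BAD(f_1, f_2)$, if $\sigma \notin BAD(f_1, f_2)$, then $L_2^{(i)} \neq L_2^{(j)}$ and $R_2^{(i)} \neq R_2^{(j)}$ for all $1 \le i < j \le q$. Therefore, since $f_3$ and $f_4$ are independently chosen from the UPEs $\mathcal{P}_n$ and $\mathcal{P}_m$, respectively, we obtain that

\[ \Pr\left( T_{\Lambda_{n+m}} = \sigma \mid \sigma \notin BAD(f_1, f_2) \right) = \frac{(2^n - q)!}{2^n!} \cdot \frac{(2^m - q)!}{2^m!} , \]

which completes the assertion. □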



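Proof of Theorem 1. It suffices to show the assertion under the assumption that $f_1, f_3 \in \mathcal{P}_n$ and $f_2, f_4 \in \mathcal{P}_m$ are independently chosen from the $n$-bit and $m$-bit UPEs, respectively. Let $\Lambda_{n+m}$ be the $(n + m)$-bit permutation ensemble of all $\Lambda_{n+m}(f_1, f_2, f_3, f_4) = M_{f_4} \circ M_{f_3} \circ M_{f_2} \circ M_{f_1}$ and $\Theta$ be the set of all $\mathcal{D}$-transcripts $\sigma$ such that the output of $\mathcal{D}$ is $C_{\mathcal{D}}(\sigma) = 1$. Then

\[ ADV_{\mathcal{D}} = \left| \Pr\left( C_{\mathcal{D}}(T_{\Lambda_{n+m}}) = 1 \right) - \Pr\left( C_{\mathcal{D}}(T_{\mathcal{P}_{n+m}}) = 1 \right) \right| \]
\[ \le \sum_{\sigma \in \Theta} \Pr(\sigma \notin BAD(f_1, f_2)) \cdot \left| \Pr\left( T_{\Lambda_{n+m}} = \sigma \mid \sigma \notin BAD(f_1, f_2) \right) - \Pr\left( T_{\mathcal{P}_{n+m}} = \sigma \right) \right| \qquad (1) \]
\[ + \sum_{\sigma \in \Theta} \Pr\left( T_{\Lambda_{n+m}} = \sigma,\ \sigma \in BAD(f_1, f_2) \right) \qquad (2) \]
\[ + \sum_{\sigma \in \Theta} \Pr(\sigma \in BAD(f_1, f_2)) \cdot \Pr\left( T_{\mathcal{P}_{n+m}} = \sigma \right) . \qquad (3) \]

By Lemma 4, the term (1) is bounded above by $\varepsilon_{n,m,q}$, and by Lemma 3, the value of (3) is bounded by

\[ \max_{\sigma} \Pr(\sigma \in BAD(f_1, f_2)) \cdot \sum_{\sigma \in \Theta} \Pr\left( T_{\mathcal{P}_{n+m}} = \sigma \right) < (q^2 - q) \left( \frac{1}{2^n} + \frac{1}{2^m} \right) . \]

On the other hand, by Lemma 3, the value of (2) is estimated as

\[ \sum_{\sigma \in \Theta} \Pr\left( T_{\Lambda_{n+m}} = \sigma,\ \sigma \in BAD(f_1, f_2) \right) = \sum_{\sigma \in \Theta} \Pr\left( T_{\Lambda_{n+m}} = \sigma \right) \cdot \Pr\left( \sigma \in BAD(f_1, f_2) \mid T_{\Lambda_{n+m}} = \sigma \right) < (q^2 - q) \left( \frac{1}{2^n} + \frac{1}{2^m} \right) . \]

Therefore we can conclude that

\[ ADV_{\mathcal{D}} < 2 (q^2 - q) \left( \frac{1}{2^n} + \frac{1}{2^m} \right) + \varepsilon_{n,m,q} , \]

which is negligible. □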
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2.3 Pseudorandomness of KASUMI 

From Theorem 1, it becomes a reasonable assumption that the FI function of KASUMI is a PPE. In order to investigate the pseudorandomness of KASUMI, we use a simplified figure of KASUMI. The four round simplified KASUMI is illustrated in Figure 1, where $x = (x_1, x_2, x_3, x_4)$ denotes a $4n$-bit input value, and $w = (w_1, w_2, w_3, w_4)$, $y = (y_1, y_2, y_3, y_4)$, and $z = (z_1, z_2, z_3, z_4)$ denote the corresponding outputs of the two, three, and four round KASUMI, respectively. Each of $x_i$, $w_i$, $y_i$, and $z_i$ is an $n$-bit value. The following theorem shows that three rounds of KASUMI are insufficient for a PPE.

Fig. 1. Simplified four round KASUMI



Theorem 2. The three round simplified KASUMI is not a $4n$-bit PPE even though the $f_i$'s ($i = 1, \dots, 9$) of Figure 1 are independently chosen from an $n$-bit PPE.

Proof. Let $\Lambda_{4n}$ be the set of all permutations over $I_{4n}$ obtained from the three round simplified KASUMI. Consider a distinguisher $\mathcal{D}$ as follows:

1. $\mathcal{D}$ chooses four $4n$-bit queries $x^{(1)}, x^{(2)}, x^{(3)}$ and $x^{(4)}$ such that

\[ x^{(1)} = (0, 0, x_3, x_4) , \quad x^{(2)} = (x_1, 0, x_3, x_4) , \quad x^{(3)} = (0, x_2, x_3, x_4) , \quad x^{(4)} = (x_1, x_2, x_3, x_4) , \]

where $x_1 \neq 0 \neq x_2$ and $x_3, x_4$ are fixed $n$-bit values.

2. $\mathcal{D}$ sends these four queries to the oracle $\mathcal{O}$ and receives the corresponding answers $y^{(i)} = (y_1^{(i)}, y_2^{(i)}, y_3^{(i)}, y_4^{(i)})$ ($i = 1, 2, 3, 4$) from the oracle.

3. $\mathcal{D}$ outputs 1 if and only if

\[ y_2^{(1)} \oplus y_2^{(2)} \oplus y_2^{(3)} \oplus y_2^{(4)} = 0 . \]

If the oracle implements the UPE $\mathcal{P}_{4n}$, then we obtain that

\[ \Pr(\mathcal{D} \text{ outputs } 1 \mid \mathcal{O} \leftarrow \mathcal{P}_{4n}) \le \frac{2^{4n} (2^{4n} - 1)(2^{4n} - 2)\, 2^{3n}\, (2^{4n} - 4)!}{2^{4n}!} = \frac{2^{3n}}{2^{4n} - 3} < \frac{1}{2^n - 1} . \]

On the other hand, if $\mathcal{O}$ implements $\Lambda_{4n}$, then for $x^{(1)} = (0, 0, x_3, x_4)$, $x^{(2)} = (x_1, 0, x_3, x_4)$, $x^{(3)} = (0, x_2, x_3, x_4)$, and $x^{(4)} = (x_1, x_2, x_3, x_4)$, we can see from Figure 1 that the corresponding $2n$-bit inputs of the second round are

\[ (F_1(x_3, x_4)|_L,\ F_1(x_3, x_4)|_R) , \quad (F_1(x_3, x_4)|_L,\ x_1 \oplus F_1(x_3, x_4)|_R) , \]
\[ (x_2 \oplus F_1(x_3, x_4)|_L,\ F_1(x_3, x_4)|_R) , \quad (x_2 \oplus F_1(x_3, x_4)|_L,\ x_1 \oplus F_1(x_3, x_4)|_R) \]

respectively, where $F_1 = M_{f_3} \circ M_{f_2} \circ M_{f_1}$ and $(x|_L, x|_R)$ denote the left and right $n$-bit blocks of a $2n$-bit value $x$. Thus we obtain, by an argument similar to that of Sakurai-Zheng [11], that

\[ y_2^{(1)} \oplus y_2^{(2)} \oplus y_2^{(3)} \oplus y_2^{(4)} = 0 \]

with probability 1.

Consequently we obtain that

\[ ADV_{\mathcal{D}} = \left| \Pr(\mathcal{D} \text{ outputs } 1 \mid \mathcal{O} \leftarrow \mathcal{P}_{4n}) - \Pr(\mathcal{D} \text{ outputs } 1 \mid \mathcal{O} \leftarrow \Lambda_{4n}) \right| > 1 - \frac{1}{2^n - 1} , \]

which is non-negligible. □
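The four-query distinguisher can be simulated on a toy instance. The sketch below rests on several assumptions that are not confirmed by the text: the word size is n = 4, the round functions are random lookup tables, and the placement of the words inside each half (which word enters the MISTY structure first, and where its outputs land) is one plausible reading of Figure 1, chosen so that the tested word is $y_2$. The names `fo`, `kasumi3` and `distinguisher` are hypothetical.

```python
import random

N = 4  # toy word size in bits


def rand_perm(rng, bits):
    table = list(range(1 << bits))
    rng.shuffle(table)
    return table


def fo(fs, half):
    """Three-round MISTY-type function on a 2n-bit half (assumed word layout)."""
    a, b = half
    L, R = b, a                  # assumption: the second word enters first
    for f in fs:
        L, R = R, f[L] ^ R       # M_f(L, R) = (R, f(L) xor R)
    return (R, L)


def xor2(u, v):
    return (u[0] ^ v[0], u[1] ^ v[1])


def kasumi3(fs, x):
    """Three DES-type rounds D_F(L, R) = (R, L xor F(R)) with F = fo."""
    left, right = (x[0], x[1]), (x[2], x[3])
    for r in range(3):
        left, right = right, xor2(left, fo(fs[3 * r:3 * r + 3], right))
    return left + right


def distinguisher(oracle, x1, x2, x3, x4):
    """Outputs 1 iff the second output word XORs to zero over the four queries."""
    queries = [(0, 0, x3, x4), (x1, 0, x3, x4), (0, x2, x3, x4), (x1, x2, x3, x4)]
    acc = 0
    for q in queries:
        acc ^= oracle(q)[1]
    return 1 if acc == 0 else 0


rng = random.Random(1)
results = []
for _ in range(50):
    fs = [rand_perm(rng, N) for _ in range(9)]
    results.append(distinguisher(lambda q: kasumi3(fs, q), 3, 5, 7, 9))
print(results.count(1), "of", len(results))
```

In this toy model the distinguisher outputs 1 for every choice of round permutations, whereas a random 16-bit permutation would satisfy the same relation only rarely.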



The following theorem guarantees that KASUMI with four or more rounds is a pseudorandom permutation ensemble.

Theorem 3. If the $f_i$'s ($i = 1, 2, \dots, 12$) in Figure 1 are independently chosen from an $n$-bit PPE, then the four round KASUMI is a $4n$-bit PPE.

From Figure 1, we can see that the second round outputs $w_3$ and $w_4$ depend on $f_1, \dots, f_6$ and $f_1, \dots, f_5$, respectively. So we set

\[ w_3 = w_3^{f_1, \dots, f_6}(x) \quad \text{and} \quad w_4 = w_4^{f_1, \dots, f_5}(x) , \]

where $x = (x_1, x_2, x_3, x_4) \in I_{4n}$ is an input value of KASUMI. As in the previous section, we define the bad event needed to prove Theorem 3.

Definition 12. For any $n$-bit permutations $f_1, \dots, f_6$, $BAD(f_1, \dots, f_6)$ is defined as the set of all $\mathcal{D}$-transcripts $\sigma = \{(x^{(1)}, y^{(1)}), \dots, (x^{(q)}, y^{(q)})\}$ satisfying: $\exists\, 1 \le i < j \le q$ such that

\[ w_3^{f_1, \dots, f_6}(x^{(i)}) = w_3^{f_1, \dots, f_6}(x^{(j)}) \quad \text{or} \quad w_4^{f_1, \dots, f_5}(x^{(i)}) = w_4^{f_1, \dots, f_5}(x^{(j)}) . \]

Lemma 5. Let $f_1, \dots, f_6$ be chosen independently from the UPE $\mathcal{P}_n$. Then for any $\mathcal{D}$-transcript $\sigma = \{(x^{(1)}, y^{(1)}), \dots, (x^{(q)}, y^{(q)})\}$,

\[ \Pr(\sigma \in BAD(f_1, \dots, f_6)) < \frac{q^2 - q}{2^n - 1} . \]

Proof. Let $a_k^{(i)}$ be the $n$-bit input value of $f_k$ ($k = 1, \dots, 6$) when the query of $\mathcal{D}$ is $x^{(i)} = (x_1^{(i)}, x_2^{(i)}, x_3^{(i)}, x_4^{(i)})$ ($i = 1, \dots, q$). For example,

\[ a_3^{(i)} = x_3^{(i)} \oplus f_1(x_4^{(i)}) , \]
\[ a_5^{(i)} = x_1^{(i)} \oplus x_3^{(i)} \oplus f_1(x_4^{(i)}) \oplus f_2(x_3^{(i)}) \oplus f_3(x_3^{(i)} \oplus f_1(x_4^{(i)})) . \]

Then it is easy to show that if $a_k^{(i)} \neq a_k^{(j)}$ for some $k = 1, \dots, 6$, then by Lemma 1

\[ \Pr\left( w_3^{f_1, \dots, f_6}(x^{(i)}) = w_3^{f_1, \dots, f_6}(x^{(j)}) \right) < \frac{1}{2^n - 1} ; \]

otherwise ($a_k^{(i)} = a_k^{(j)}$ for all $k = 1, \dots, 6$) this probability is zero, since in this case $w_3^{f_1, \dots, f_6}(x^{(i)}) = w_3^{f_1, \dots, f_6}(x^{(j)})$ implies $x^{(i)} = x^{(j)}$, which contradicts the assumption that all queries are distinct. By a similar argument we can also show that $\Pr\left( w_4^{f_1, \dots, f_5}(x^{(i)}) = w_4^{f_1, \dots, f_5}(x^{(j)}) \right)$ has the same upper bound. □

Lemma 6. Let $\Lambda_{4n}$ be the $4n$-bit permutation ensemble obtained from the four round KASUMI of Figure 1, where all $f_i$'s ($i = 1, \dots, 12$) are independently chosen from the $n$-bit UPE. Then for any $\mathcal{D}$-transcript $\sigma = \{(x^{(1)}, y^{(1)}), \dots, (x^{(q)}, y^{(q)})\}$,

\[ \left| \Pr\left( T_{\Lambda_{4n}} = \sigma \mid \sigma \notin BAD(f_1, \dots, f_6) \right) - \Pr\left( T_{\mathcal{P}_{4n}} = \sigma \right) \right| < \varepsilon_{n,q} , \]

where

\[ \varepsilon_{n,q} = \frac{1}{2^{4n} (2^n - 1)^4 \cdots (2^n - q + 1)^4} . \]

Proof. For any possible $\mathcal{D}$-transcript $\sigma$ we have that

\[ \Pr\left( T_{\mathcal{P}_{4n}} = \sigma \right) = \frac{(2^{4n} - q)!}{2^{4n}!} . \]

In Figure 1, by considering the four paths $w_3 \to f_8 \to z_2$, $w_3 \to f_8 \to f_{10} \to f_{12} \to z_3$, $w_4 \to f_7 \to f_9 \to z_1$, and $w_4 \to f_7 \to f_9 \to f_{11} \to z_4$, we can obtain that

\[ \Pr\left( T_{\Lambda_{4n}} = \sigma \mid \sigma \notin BAD(f_1, \dots, f_6) \right) = \left( \frac{(2^n - q)!}{2^n!} \right)^4 , \]

which completes the proof of this lemma. □

Proof of Theorem 3. From Lemmas 5 and 6, Theorem 3 follows straightforwardly by the same process as in the proof of Theorem 1. □



3 Provable Security of the Encryption Mode f8

To guarantee message confidentiality over the wireless link of W-CDMA for 3GPP, the f8 encryption mode has been proposed, which is based on the block cipher KASUMI [12]. In this section we examine the provable security of the 3GPP encryption mode f8 under the assumption that the underlying block cipher is a pseudorandom permutation. Note that this assumption is reasonable given the result of the previous section.

3.1 Notions of Security for a Symmetric Encryption Mode 

A symmetric encryption scheme is defined as a triple of algorithms $(\mathcal{K}, \mathcal{E}, \mathcal{D})$, where $\mathcal{K}$ is the probabilistic algorithm for key generation, $\mathcal{E}$ is the probabilistic algorithm which encrypts the plaintext $M$ with the key $K$ and outputs the ciphertext $C$, and $\mathcal{D}$ is the deterministic algorithm which decrypts the ciphertext $C$ with the key $K$ and outputs the corresponding plaintext $M$. Here $M$ is selected from a set of messages. Bellare et al. [2] considered four notions of security for symmetric encryption modes. "Real-or-random indistinguishability" and its variant "left-or-right indistinguishability" were first introduced. "Find-then-guess security" and "semantic security", which are notions for asymmetric encryption schemes, were adapted to the symmetric setting. They also investigated the relations among these notions of security [2]. Real-or-random and left-or-right indistinguishability are equivalent up to a small constant factor in the reduction. These notions also have a security-preserving reduction to find-then-guess security. However, the reduction from find-then-guess security to left-or-right indistinguishability is not security-preserving. There are security-preserving reductions between find-then-guess and semantic security.

Here we analyze the security of the 3GPP f8 mode by applying the notion of left-or-right indistinguishability, since left-or-right security implies good reductions to the other three definitions as described above. Left-or-right indistinguishability is a strong form of chosen-plaintext security. It considers two different games. In either game a query is a pair $(x_1, x_2)$ of equal-length strings from the given message space. In either game a key $a \in \mathcal{K}$ is selected at random and fixed for the duration of the game. In Game 1, an oracle receiving $(x_1, x_2)$ responds with $\mathcal{E}_a(x_1)$. In Game 2, it responds with $\mathcal{E}_a(x_2)$. Thus Game 1 provides a "left" oracle and Game 2 provides a "right" oracle. An encryption scheme is secure if a reasonable adversary cannot obtain a significant advantage in distinguishing Game 1 from Game 2.

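Definition 13 (Left-or-right indistinguishability [2]). An encryption scheme $(\mathcal{K}, \mathcal{E}, \mathcal{D})$ is said to be $(t, q, \mu; \epsilon)$-secure, in the left-or-right sense, if for any adversary $A$ who runs in time at most $t$ and makes at most $q$ oracle queries, totaling at most $\mu$ bits,

\[ ADV_A^{lr} = \left| \Pr_{a \leftarrow \mathcal{K}}\left( A^{\mathcal{O}(1, (\cdot, \cdot))} = 1 \right) - \Pr_{a \leftarrow \mathcal{K}}\left( A^{\mathcal{O}(2, (\cdot, \cdot))} = 1 \right) \right| \le \epsilon . \]

An encryption scheme $(\mathcal{K}, \mathcal{E}, \mathcal{D})$ admits a $(t, q, \mu; \epsilon)$-break, in the left-or-right sense, if for some adversary $A$ who runs in time at most $t$ and makes at most $q$ oracle queries, totaling at most $\mu$ bits, $ADV_A^{lr} > \epsilon$.

In the above definition, $A^{\mathcal{O}(1, (\cdot, \cdot))}$ and $A^{\mathcal{O}(2, (\cdot, \cdot))}$ indicate $A$ with an oracle $\mathcal{O}$ which returns $y = \mathcal{E}_a(x_1)$ and $y = \mathcal{E}_a(x_2)$, respectively, in response to a query $(x_1, x_2)$. $\Pr_{a \leftarrow \mathcal{K}}\left( A^{\mathcal{O}(i, (\cdot, \cdot))} = 1 \right)$ ($i = 1, 2$) denotes the probability that the adversary $A$ with an oracle $\mathcal{O}(i, (\cdot, \cdot))$ outputs 1 when a key $a$ is chosen randomly from the key space $\mathcal{K}$.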
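The two games of the definition can be made concrete with a small harness. This is a minimal sketch under stated assumptions: `det_encrypt` is a deliberately weak, deterministic toy scheme invented for illustration (it is not f8 and not taken from the paper), and the adversary shown is the standard one that exploits determinism.

```python
import hashlib
import secrets


def keygen():
    """Key generation: a fresh random 128-bit key."""
    return secrets.token_bytes(16)


def det_encrypt(key, msg):
    """Toy deterministic scheme (illustration only): XOR with a fixed key-derived pad."""
    pad = hashlib.sha256(key).digest()
    return bytes(m ^ p for m, p in zip(msg, pad))


def make_lr_oracle(encrypt, key, game):
    """Left-or-right oracle: Game 1 encrypts x1, Game 2 encrypts x2."""
    def oracle(x1, x2):
        if len(x1) != len(x2):
            raise ValueError("queries must be pairs of equal-length strings")
        return encrypt(key, x1 if game == 1 else x2)
    return oracle


def adversary(oracle):
    """Outputs 1 when it believes it is talking to the left oracle."""
    a, b = b"\x00" * 8, b"\xff" * 8
    c1 = oracle(a, a)  # an encryption of a in both games
    c2 = oracle(a, b)  # an encryption of a in Game 1, of b in Game 2
    return 1 if c1 == c2 else 0


left = adversary(make_lr_oracle(det_encrypt, keygen(), 1))
right = adversary(make_lr_oracle(det_encrypt, keygen(), 2))
print(left, right)
```

Against this toy scheme the adversary outputs 1 in Game 1 and 0 in Game 2 with only two queries, so its advantage is 1. A mode such as f8 avoids this by deriving a fresh keystream from a per-message counter value.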

The encryption mode f8 is based on the block cipher KASUMI, which is a pseudorandom permutation ensemble according to the previous section. Let $B_l$ be the function family obtained from a block cipher with $l$-bit input/output values. To analyze the provable security of the f8 mode, we need a more rigorous definition of a PPE than Definition 4.

Definition 14. A permutation family $B_l$ is said to be a $(t, q; \epsilon)$-secure PPE if for any distinguisher $\mathcal{D}$ who makes at most $q$ oracle queries and runs in time at most $t$, $ADV_{\mathcal{D}} \le \epsilon$.



3.2 Security of f8 Encryption Mode

In this subsection, we prove the security of the 3GPP f8 encryption mode by using the notion of left-or-right security. The underlying function of the encryption mode is fixed to a PPE $B_l$ with $l$-bit input/output length. Let $a \in \mathcal{K}$ be the key shared between the two parties who run the encryption scheme. It will be used to specify the functions $g = B_l[a]$ and $g' = B_l[a \oplus KM]$ determined by the keys $a$ and $a \oplus KM$, respectively, where $KM$ is a 128-bit fixed constant. We describe the encryption mode f8 rigorously as the following scheme. This scheme is also illustrated in Figure 2.

The scheme $f8^g(x)$ works as follows:

Function $f8^g(x)$
    $IV \leftarrow g'(Count \| Direction \| Bearer \| 0 \dots 0)$
    $Reg_1 = IV$
    for $i = 1, \dots, n$ do
        $o_i = g(Reg_i)$
        $y_i = o_i \oplus x_i$
        $Reg_{i+1} = IV \oplus i \oplus o_i$
    return $(y_1 \dots y_n)$

Fig. 2. 3GPP f8 encryption mode



In the above scheme, $Count$ is a 32-bit encryption sequence number depending on the time, $Bearer$ is a 5-bit bearer identifier, $Direction$ is a 1-bit direction identifier, and $0 \dots 0$ denotes the padding that brings the length of the input to $l$ bits. The difference between OFB and the f8 mode is that the initial nonce $ctr = (Count \| Direction \| Bearer \| 0 \dots 0)$ is not sent to the receiver, and $g'(ctr)$, rather than $ctr$ in cleartext, is fed to the underlying function $g$.
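The scheme described above can be sketched in code. This is a minimal illustration, not an implementation of the 3GPP algorithm: `standin_block` is a keyed hash used as a stand-in for KASUMI (it is not a permutation and not the real cipher), the value used for `KM` is an assumption since the text only states that it is a fixed 128-bit constant, and the field order in `make_ctr` follows the order written in the text.

```python
import hashlib

BLOCK = 8  # block size in bytes (l = 64 bits)


def standin_block(key, block):
    """Stand-in for the block function B_l[key]; NOT KASUMI."""
    return hashlib.blake2b(block, key=key, digest_size=BLOCK).digest()


def xor_bytes(a, b):
    """XOR two byte strings, truncating to the shorter one."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_ctr(count, direction, bearer):
    """Count (32 bits) || Direction (1 bit) || Bearer (5 bits) || zero padding."""
    value = (count << 32) | (direction << 31) | (bearer << 26)
    return value.to_bytes(BLOCK, "big")


def f8(key, km, ctr, data):
    """f8-style keystream mode: encryption and decryption are the same operation."""
    iv = standin_block(xor_bytes(key, km), ctr)       # IV = g'(ctr)
    reg = iv                                          # Reg_1 = IV
    out = bytearray()
    nblocks = (len(data) + BLOCK - 1) // BLOCK
    for i in range(1, nblocks + 1):
        o = standin_block(key, reg)                   # o_i = g(Reg_i)
        chunk = data[(i - 1) * BLOCK:i * BLOCK]
        out += xor_bytes(chunk, o)                    # y_i = o_i xor x_i
        reg = xor_bytes(xor_bytes(iv, i.to_bytes(BLOCK, "big")), o)
    return bytes(out)


key = bytes(range(16))
KM = bytes([0x55]) * 16          # assumed constant, for illustration
ctr = make_ctr(count=1, direction=0, bearer=3)
message = b"twenty byte message!"
ciphertext = f8(key, KM, ctr, message)
print(f8(key, KM, ctr, ciphertext) == message)
```

Because the output is the plaintext XORed with a keystream that depends only on the key and on $ctr$, applying the function twice with the same parameters recovers the message. This also shows why $ctr$ must not repeat under one key.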

We consider two function families. $f8^{\mathcal{P}_l}$ is the set of all functions $f8^g$ where $g$ is chosen from the UPE $\mathcal{P}_l$, and $f8^{B_l}$ is the set of all functions $f8^g$ where $g$ is chosen from the PPE $B_l$. We first derive an upper bound on the success of any adversary trying to break $f8^{\mathcal{P}_l}$ in the left-or-right sense. Next we examine the security of $f8^{B_l}$. The basic idea for proving the security of f8 is that left-or-right security breaks down at the first repetition of the value of $Reg$. If $Reg_i = Reg_j$ for $i \neq j$, then also $o_i = o_j$. Hence $y_i \oplus y_j = x_i^b \oplus x_j^b$ ($b = 1, 2$). Thus $b$ is revealed if $x_i^1 \oplus x_j^1 \neq x_i^2 \oplus x_j^2$.

Lemma 7. Let $A$ be any adversary attacking $f8^{\mathcal{P}_l}$ in the left-or-right sense, making at most $q$ queries, totaling at most $\mu$ bits. Then

\[ ADV_A^{lr} \le \frac{(\mu / l) \cdot (\mu / l - 1)}{\text{[denominator illegible in the source]}} . \]


Proof. Let $(x_1^1, x_1^2), \ldots, (x_q^1, x_q^2)$ be the oracle queries of the adversary $A$, each
consisting of a pair of equal-length messages. These queries are random variables
that depend on the coin tosses of $A$ and the responses of the oracle to previous
queries. Let $ctr_i = (Count_i \,\|\, Direction_i \,\|\, Bearer_i \,\|\, 0 \ldots 0)$ and $IV_i = g'(ctr_i) \in \{0,1\}^l$
be the values associated with $(x_i^1, x_i^2)$, as computed by the oracle, for $i = 1, \ldots, q$. Let $n_i$
be the number of blocks in the $i$-th query, $x_i^b = x_i^b[1] \cdots x_i^b[n_i]$ ($b \in \{1, 2\}$) be
the $i$-th query message, and $y_i = y_i[1] \cdots y_i[n_i]$ be the response of the oracle to
the $i$-th query message. $Reg_i = Reg_i[1] \cdots Reg_i[n_i]$ is the contents of the register
Reg in the $i$-th query, where $Reg_i[j]$ ($j \in \{1, \ldots, n_i\}$) denotes the content of the
register corresponding to the $j$-th block of the $i$-th query message. We let $O_i[j]$
be the value computed by applying the function $g$ to $Reg_i[j]$. Let $\Pr_1(\cdot)$ denote
the probability in Game 1, which provides the adversary $A$ with the left oracle, and
$\Pr_2(\cdot)$ denote the probability in Game 2, which provides the adversary $A$ with the
right oracle.

Let $C$ be the collision event, i.e., $Reg_i[k] = Reg_j[k']$ for some $(i, k) \neq (j, k')$,
with $i, j = 1, \ldots, q$, $k = 1, \ldots, n_i$ and $k' = 1, \ldots, n_j$. The event $C^c$, the
complement of $C$, depends on $IV_i$, $O_i[k]$ and $k$ for each query. Since $g$ and $g'$
are chosen from the UPE $\mathcal{P}_l$, $IV_i$ and $O_i[k]$ are random and independent of the
message given to the oracle. Thus the collision probability does not depend on
$b$, and the following equation holds:

$$\Pr\nolimits_1(C^c) = \Pr\nolimits_2(C^c) . \tag{4}$$



For the same reason, if no collision occurs, the adversary outputs 1 with the
same probability for Game 1 and Game 2 because each ciphertext block given to
the adversary is independent of any previous ciphertext blocks and of message
blocks. Namely, the following holds:

$$\Pr\nolimits_1(A = 1 \mid C^c) = \Pr\nolimits_2(A = 1 \mid C^c) . \tag{5}$$

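Therefore, by using equations (4) and (5), we can write the adversary's advantage
as follows:

$$\begin{aligned}
\mathrm{Adv}_A &= |\Pr\nolimits_1(A = 1) - \Pr\nolimits_2(A = 1)| \\
&= |\Pr\nolimits_1(A = 1 \mid C) \cdot \Pr\nolimits_1(C) + \Pr\nolimits_1(A = 1 \mid C^c) \cdot \Pr\nolimits_1(C^c) \\
&\qquad - \Pr\nolimits_2(A = 1 \mid C) \cdot \Pr\nolimits_2(C) - \Pr\nolimits_2(A = 1 \mid C^c) \cdot \Pr\nolimits_2(C^c)| \\
&= |(\Pr\nolimits_1(A = 1 \mid C) - \Pr\nolimits_2(A = 1 \mid C)) \, \Pr\nolimits_1(C)| \\
&\le \Pr\nolimits_1(C) .
\end{aligned}$$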

Given equation (4), we drop the subscript when talking about the probability
of $C$ and write the above simply as $\Pr(C)$. Now we want to compute an upper
bound on $\Pr(C)$. The adversary does not know the contents of the register Reg
because she does not know $IV_i = g'(ctr_i)$. Hence the adversary cannot directly identify
a collision $Reg_i[k] = Reg_j[k']$ ($(i, k) \neq (j, k')$, $i, j = 1, \ldots, q$, $k = 1, \ldots, n_i$
and $k' = 1, \ldots, n_j$). However, the adversary knows the values $O_i[k]$ since
she knows the queried message block $x_i[k]$ and the answered ciphertext block
$y_i[k]$. She can therefore identify $O_i[k] = O_j[k']$. Since $g$ is a permutation, the output
collision $O_i[k] = O_j[k']$ implies the following:

$$O_i[k] = O_j[k'] \iff g(Reg_i[k]) = g(Reg_j[k']) \iff Reg_i[k] = Reg_j[k'] .$$

Thus, to compute an upper bound on $\Pr(C)$ we compute the probability of
the output collision event $T$, i.e., $O_i[k] = O_j[k']$ for some $(i, k) \neq (j, k')$, with
$i, j = 1, \ldots, q$, $k = 1, \ldots, n_i$ and $k' = 1, \ldots, n_j$. We define the stream $B$ as

$$B = O_1[1] \ldots O_1[n_1] \, O_2[1] \ldots O_2[n_2] \ldots O_q[1] \ldots O_q[n_q] .$$



That is, $B$ consists of the output values of $g$ up to the $n_q$-th encryption of the last
($q$-th) query. The length of $B$ is $Q = l \cdot \sum_{i=1}^{q} n_i \le \mu$ bits. We first count the
streams with a collision $O_i[k] = O_j[k']$ for a given pair $(i, k)$ and $(j, k')$
($(i, k) \neq (j, k')$, $1 \le i, j \le q$, $k = 1, \ldots, n_i$, $k' = 1, \ldots, n_j$). As $O_i[k] = O_j[k']$,
there are $2^l$ possible values for the two blocks together. The remaining $Q - 2l$ bits have
$2^{Q-2l}$ possibilities. Thus the number of streams with this collision is $2^{Q-l}$. There are
$(\mu/l)(\mu/l - 1)/2$ possible pairs $(i, k)$ and $(j, k')$. Hence the number $\eta$ of streams
$B$ with at least one collision is at most $(\mu/l)(\mu/l - 1) \, 2^{Q-l-1}$. The stream $B$
has $2^Q$ possibilities. Thus

$$\Pr(T^c) = \frac{2^Q - \eta}{2^Q} \ge 1 - \frac{(\mu/l)(\mu/l - 1)}{2^{l+1}} .$$

This implies the following because $\Pr(C) = \Pr(T)$:

$$\Pr(C) \le \frac{(\mu/l)(\mu/l - 1)}{2^{l+1}} .$$

□
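
To get a feeling for the size of this bound, the following Python sketch evaluates it exactly with rational arithmetic. The function name `delta_f8` and the default block length $l = 64$ (the KASUMI block size) are choices made for this illustration only.

```python
from fractions import Fraction

def delta_f8(mu_bits: int, l: int = 64) -> Fraction:
    """Collision bound (mu/l)(mu/l - 1) / 2^(l+1) for mu_bits queried bits."""
    blocks = Fraction(mu_bits, l)          # number of l-bit blocks, mu / l
    return blocks * (blocks - 1) / 2 ** (l + 1)

# Example: 2^20 blocks of 64 bits keep the bound just below 2^-25.
small = delta_f8(64 * 2**20)
```

With about $2^{32}$ blocks the value approaches $1/2$, the usual birthday limit for a 64-bit block.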



In practice, the underlying block cipher $g$ is modeled as a pseudorandom
permutation, so we prove the security of the 3GPP f8 mode using
a pseudorandom permutation. The result is derived from Lemma 7.



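Theorem 4. Let $B_l$ be a $(t', q'; \epsilon')$-secure PPE with $l$-bit input/output length.
Then the $f8^{B_l}$ scheme is $(t, q, \mu; \epsilon)$-secure in the left-or-right sense. Here $q = q'$,
$\mu = q'l$, $t = t' - c \, \frac{\mu}{l} (l + l)$ and $\epsilon = 2\epsilon' + \delta_{f8}$, where $c > 0$ is a small constant
and

$$\delta_{f8} = \frac{(\mu/l)(\mu/l - 1)}{2^{l+1}} .$$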



Proof. The details of this proof are omitted since it is similar to the proof of
Theorem 12 in [2], with the pseudorandom function replaced by a pseudorandom
permutation. □

4 Conclusion

In this work we examined the pseudorandomness of the block cipher KASUMI
and the provable security of f8. We proved that the FI function within KASUMI,
which is composed of a four-round unbalanced MISTY-type structure, is a pseudorandom
permutation. We also showed that three-round KASUMI is not a pseudorandom permutation
ensemble, but that four-round KASUMI is a pseudorandom permutation
ensemble under the adaptive distinguisher model. Moreover, we provided an
upper bound on the security of the f8 encryption mode, under a reasonable
assumption derived from the first result, by means of the left-or-right security notion.

Abstract. In this paper, we consider the problem of mutually authenticated
key exchange between a low-power client and a powerful server.
We show how the recently proposed Jakobsson-Pointcheval scheme [15]
can be compromised using a variant of interleaving attacks. We also propose
a new scheme for achieving mutually authenticated key exchange.
The protocol is proven correct within a variant of the Bellare-Rogaway
model [3,4]. This protocol gives the same scalability as other public-key
based authenticated key exchange protocols but with much higher
efficiency and fewer messages. It takes only 20 msec of total computation
time on a PalmPilot and exchanges only three short messages during
the protocol.



1 Introduction 

The goal of a mutually authenticated key exchange protocol (MAKEP) between 
two communicating parties is to provide them with some assurance that they 
know each other’s true identity and at the same time to have the two parties end 
up sharing a common key known only to them. This common key, also known 
as session key, can then be used to provide privacy and data integrity during 
the session. In this paper, we focus our attention on the design and analysis of 
MAKEPs for the two parties in which one of them is strictly limited in both 
computational power and memory capacity while the other is as powerful as a 
conventional desktop computer. We call the low-power party the client and
the powerful one the server. Such a low-power client could be a Personal
Digital Assistant (PDA), a cellular phone or a smart card in real applications. 
A powerful server could be a base station or the security center of a wireless 
network. 

Although there is a long history of designing MAKEPs and many protocols
have been proposed for various kinds of distributed systems, they are seldom
designed for such an unbalanced system setup. For symmetric-key based MAKEPs
[21,22,3,4,16], two communicating parties share a long-lived key or have a third
party involved during runtime. In the first case, each party has to maintain a
set of distinct keys for communicating with different parties. In the latter case,
a centralized trusted party is required to be present whenever the protocol is
executed. Hence key management and scalability are two major issues when
deploying the schemes in practice. For public-key based MAKEPs [9,7,1,11], on
the other hand, high computational complexity is required on both communicating
parties. For example, a 512-bit modular exponentiation on a 16MHz Palm
V requires over one minute of pure computation as shown in [27].

* This work was sponsored by the U. S. Air Force under contract F30602-00-2-0518.

C. Boyd (Ed.): ASIACRYPT 2001, LNCS 2248, pp. 272-289, 2001.

(c) Springer-Verlag Berlin Heidelberg 2001

Recently several schemes [26,15] have been proposed for systems with unbalanced
computational power. In [26], we proposed two MAKEPs which attain
efficiency on the client side and provide scalability for most systems. However,
the schemes do not give as much scalability as a pure public-key based MAKEP
does. In [15], Jakobsson and Pointcheval proposed a MAKEP which improves
on efficiency by using precomputation. However, the protocol does not scale well
and is susceptible to a variant of interleaving attacks as shown in Sect. 3.

In this paper, we propose a new scheme which not only gives the same scalability
as other public-key based schemes do, but also requires only three short
messages in a single protocol run. It takes only 20 msec of computation time during
a run of the protocol in the case where the client is a 16MHz PalmPilot. We
also show that it is secure within a variant of the Bellare-Rogaway model [3,4] and
provides a reasonable amount of forward secrecy.

The remainder of the paper is organized as follows. In Sect. 2, we describe
some notation used throughout this paper. In Sect. 3, we present a variant of
interleaving attacks and show how it can be used to compromise the MAKEP
proposed in [15]. In Sect. 4, we introduce a new protocol referred to as the
Client-Server MAKEP. A formal security analysis of the protocol is given in
Sect. 5 and the performance is examined in Sect. 6. We conclude with some
discussion of other properties of the protocol in Sect. 7.

2 Preliminaries 

Let $\mathcal{E}_K$ and $\mathcal{D}_K$ denote the encryption and decryption transformations under the
symmetric key $K$ respectively. For public key systems, each entity A has a public
key $PK_A$ and a private key $SK_A$. For simplicity of notation, we use $\mathcal{E}_{PK_A}$ to
denote the public key encryption. $Sig_{TA}$ is a secret signing algorithm and $Ver_{TA}$
is the public verification algorithm of a trusted authority TA. A certificate $Cert_A$
of an entity A is denoted by

$Cert_A = \langle ID_A, m, Sig_{TA}(ID_A, m) \rangle$,

where $ID_A$ uniquely identifies A, and $m$ is some message being certified. Usually
$m$ is A's public key with some other information such as a serial number and an
expiration date. By $\langle X \rangle$, we mean an appropriate encoding of $X$. A nonce,
denoted by $r \in_R \{0,1\}^k$, is a $k$-bit random number. We use $x \in_R S$ to denote that
$x$ is chosen uniformly at random from the set $S$.

Throughout the paper, we assume that $G$ is a cyclic group of prime order
$q$ and $g$ is a generator of $G$. For simplicity, we only consider the case where $G$
is a subgroup of $\mathbb{Z}_p^*$, the multiplicative group of the integers modulo a prime $p$.
However, the discussion applies equally well to any group of prime order in which
the discrete logarithm problem is computationally intractable. We also assume
that the domain parameters $(g, p, q)$ are publicly known. This assumption can
be dropped when the information is exchanged via certificates. An example of a
certificate which contains the domain parameters will be given in Sect. 4.

3 A Variant of Interleaving Attacks — Hijacking Attack 

As described in [26], a challenge-response based MAKEP should have two primitive
elements. The first is two challenge-response pairs, which provide
mutual authentication for both participating parties. The second is the 'binding'
of each party's encrypted secret session key contribution to the corresponding
challenge number sent by its partner. The bindings have to be stringent
enough to guarantee the freshness of the session key and to counteract various
attacks. These notions are later formalized in Definition 2. To illustrate their
importance, we consider a recently proposed protocol [15] for mutual authentication
and key exchange between a low-power client and a powerful server. The
protocol is shown in Fig. 1, where $H_0$, $H_1$ and $H_2$ denote some cryptographic
hash functions.



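A (low-power client), key pair $(a, g^a)$; B (powerful server), key pair $(b, g^b)$

Precomputation by A:
    $x, t \in_R \mathbb{Z}_q \setminus \{0\}$
    $X = g^x$; $K = (g^b)^x$
    $\sigma = H_0(g^b, X, K)$
    $R = H_2(g^b, X, K)$
    $r = H_1(g^t)$

A → B: $X, r$
    B computes $K = X^b$ and $R' = H_2(g^b, X, K)$, and picks $0 \le e < 2^k$
B → A: $R', e$
    A checks $R' = R$ and computes $d = t - ea \bmod q$
A → B: $d$
    B checks $r = H_1(g^d (g^a)^e)$ and computes $\sigma = H_0(g^b, X, K)$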



Fig. 1. Jakobsson-Pointcheval MAKEP 



In the protocol, A is a low-power client and B is a powerful server. Each of them
has a public key pair. A's private key is $a$, which is chosen randomly in $\mathbb{Z}_q^*$, and
$g^a$ is her public key. It is assumed that A knows B's authentic public key, and
vice versa. Before the protocol begins, A precomputes a session key $\sigma$ from $K$,
where $K$ is a secret to be shared with B using the Diffie-Hellman key exchange
technique [9]. (This is done through $X$ in the first message from A to B in the
protocol.) She also precomputes the expected response of B, which is denoted
as $R$, and a value $r$ which is intended to be used for client authentication. Now
when A runs the protocol with B, A authenticates B (server authentication) by
checking if the incoming message (second message of the protocol) contains a
value denoted by $R'$ which is equal to $R$. We note that B's response $R'$ 'binds' the
challenge number $X$ of A to his computed secret $K$. For client authentication, B
chooses an independent random number $e$ and sends it to A. A then computes a
value $d$ and sends it back to B. After B receives it, he computes $H_1(g^d (g^a)^e)$
and compares it with the value of $r$ received from A. We note that B's challenge
number $e$ is never bound to any secret value or to any previous message
of the current session. This oversight creates a vulnerability in the protocol.

We now show that this protocol is vulnerable to an attack pictured in Fig. 2,
where E denotes an adversary, and S1 and S2 denote two parallel sessions to be
established with B.



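[Figure: message flow of the attack among A with key pair $(a, g^a)$, the adversary E and the server B, over sessions S1 and S2. Recoverable labels: A sends $\langle X, r \rangle$; B computes $K' = X'^b$ and $R'' = H_2(g^b, X', K')$ and picks $0 \le e' < 2^k$; A receives $R', e'$ and returns $d' = t - e'a \bmod q$, which is relayed to B; B computes $\sigma' = H_0(g^b, X', K')$.]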



Fig. 2. An attack on Jakobsson-Pointcheval MAKEP 



In this attack, we assume that E is an active adversary who is 'sitting' in between
A and B and can intercept and inject messages. We also assume that B
accepts multiple session connections from different instances of a single party
simultaneously. In the figure, each message is associated with a session denoted
by a circle containing the session number. For example, the message $\langle X, r \rangle$
from A to E is marked as belonging to session S1. Session S1 begins when
A sends the first message $\langle X, r \rangle$ to B, but the message is also eavesdropped by E. E
immediately creates a new message which contains an element $X' \in G$ and $r$. E
then sends this message to B, alleging that it comes from another instance of A.
B thinks that A wants to establish two sessions, denoted by S1 and S2, with him
at the same time. He then sends back two messages, one for session S1 and the
other for session S2. After receiving the two messages from B, E constructs a
message which contains the correct response $R'$ for session S1 and the challenge
number $e'$ of B for session S2. This message is sent to A by E, alleging that
it comes from B. Since A thinks that she is establishing only one session with
B, she believes that the message must be the response of B for session S1. A
then verifies the response and computes an outgoing message $d'$. However, this
message is relayed to B as the response of A for session S2. Meanwhile session
S1 remains incomplete and will finally be terminated by B after a timeout.

Through this attack, the client authentication is compromised. A believes
that she has established a secure session S1 with B sharing a secret key $K = (g^b)^x$,
while B believes that he has established a different secure session S2 with A
sharing a different secret key $K' = X'^b$. Furthermore, E can choose a random
value $x' \in \mathbb{Z}_q \setminus \{0\}$ and compute the pair $(X', K')$ as $g^{x'}$ and $(g^b)^{x'}$ respectively.

Hence she can decrypt all the encrypted messages sent from B to A.
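
The message flow of the attack can be replayed in a few lines of Python. This is a hedged sketch, not the authors' code: the toy group ($p = 1019$, $q = 509$, $g = 4$), the 8-bit challenge, the SHA-256 stand-ins for $H_1$ and $H_2$, and all function names are assumptions chosen for illustration, and the server key is written $b$.

```python
import hashlib
import secrets

# Toy parameters (assumption): g = 4 generates the subgroup of prime
# order q = 509 in Z_p^* for p = 1019 = 2q + 1. Far too small for real use.
P, Q, G = 1019, 509, 4
K_BITS = 8  # toy length of the server challenge e

def H(tag: str, *values: int) -> str:
    """Stand-in for the hash functions H_1, H_2, separated by a tag."""
    data = tag + "|" + "|".join(str(v) for v in values)
    return hashlib.sha256(data.encode()).hexdigest()

def rand_exponent() -> int:
    """Uniform element of Z_q \\ {0}."""
    return secrets.randbelow(Q - 1) + 1

def hijack_demo():
    a = rand_exponent(); g_a = pow(G, a, P)      # client key pair
    b = rand_exponent(); g_b = pow(G, b, P)      # server key pair

    # A's precomputation for session S1.
    x, t = rand_exponent(), rand_exponent()
    X = pow(G, x, P)
    K = pow(g_b, x, P)
    R = H("H2", g_b, X, K)
    r = H("H1", pow(G, t, P))

    # E opens session S2 with her own X' but reuses A's r.
    x2 = rand_exponent()
    X2 = pow(G, x2, P)
    K2_adversary = pow(g_b, x2, P)

    def server_first_reply(X_in):
        K_in = pow(X_in, b, P)
        return K_in, H("H2", g_b, X_in, K_in), secrets.randbelow(2 ** K_BITS)

    _, R1, _e1 = server_first_reply(X)           # session S1
    K2_server, _R2, e2 = server_first_reply(X2)  # session S2

    # E forwards (R1, e2) to A, who checks R1 and answers the challenge e2.
    assert R1 == R
    d = (t - e2 * a) % Q

    # E relays d into S2; B runs his usual check for that session.
    server_accepts = H("H1", pow(G, d, P) * pow(g_a, e2, P) % P) == r
    return server_accepts, K2_adversary == K2_server
```

In this sketch the server accepts session S2 as coming from A although its secret $K'$ is one the adversary can compute herself.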

One possible patch to the authentication flaw is to modify the computation
of $d$ and the verification process to the following:

$d = t - h(e, K) \, a \bmod q$

$r = H_1(g^d (g^a)^{h(e, K)})$

where $h$ is a cryptographic hash function. Another possible solution is to change
the computations of $R$ and $R'$ to $H_2(g^b, X, K, e)$. The tradeoff is that A can
no longer precompute $R$, and this may affect the performance of the protocol.
We call this attack a 'hijacking attack' because an adversary E can hijack A's
conversations with B and impersonate her in a new run of the protocol. In fact,
the hijacking attack can be seen as a variant of interleaving attacks [5,10,19].

To ensure our protocol is secure against such an attack, we adopt the commu- 
nication model where the server accepts multiple sessions from the same client 
in parallel and there is an active adversary. Details of the model will be given in 
Sect. 5.1. Furthermore, this attack highlights the importance of the authenticity 
of protocol flow, a notion formalized in [3] as Matching Conversations, which 
will be summarized in Sect. 5.2. 



4 The Protocol 

In this section, we propose a new scheme which is designed for mutually authenticated
key exchange between a low-power client and a powerful server, and we
call it the Client-Server MAKEP. In this scheme, each party has a long-lived
public key pair. For the powerful server B, we use $SK_B$ and $PK_B$ to denote
the corresponding private and public keys respectively. Our description of the
protocol can apply to any public key cryptographic algorithm, but in practice
one that requires less memory and can do efficient encryption is preferred, since
the protocol requires the client to do one public-key encryption. We assume
that the public key of the server is publicly known. For the low-power client A,
$a \in_R \mathbb{Z}_q \setminus \{0\}$ is the private key and $g^a$ is the public key. A also has a certificate
obtained from the Trusted Authority (TA). The certificate is given by

$Cert_A = \langle ID_A, g, p, q, g^a, Sig_{TA}(ID_A, g, p, q, g^a) \rangle$

As described in Sect. 2, the domain parameters $(g, p, q)$ can be removed from
the certificate if they are publicly known. The protocol is illustrated in Fig. 3
and is described as follows.

Client-Server MAKEP 

1. A selects $r_A \in_R \{0,1\}^k$ and $b \in_R \mathbb{Z}_q \setminus \{0\}$, and computes $x = \mathcal{E}_{PK_B}(r_A)$ and
$\beta = g^b$.

2. A sends $Cert_A$, $\beta$ and $x$ to B.

3. B checks $Cert_A$ by running $Ver_{TA}$ and verifies that $1 < \beta < p$ and
$\beta^q \equiv 1 \pmod p$. If any check fails, B terminates the protocol run with failure.
Otherwise, B decrypts $x$ and obtains $r_A$.

4. B selects $r_B \in_R \{0,1\}^k$, computes $\mathcal{E}_{r_A}(r_B, ID_B)$ and sends it to A. B also
computes $\sigma = r_A \oplus r_B$ and destroys $r_A$, $r_B$ from his memory.

5. A decrypts the incoming message under $r_A$ and checks if the decrypted
message contains a proper coding of $ID_B$ with some number. If the check
fails, A terminates the protocol run with failure. Otherwise, A denotes the
number as $r_B$.

6. A computes $\sigma = r_A \oplus r_B$ and $y = a \, h(\sigma) + b \pmod q$, where $h : \{0,1\}^* \to
\mathbb{Z}_q \setminus \{0\}$ is a cryptographic hash function. Then A sends $y$ to B. A also computes
$K = H(\sigma)$ as the session key and accepts the connection. She destroys
$r_A$, $r_B$ and $\sigma$ from her memory.

7. B verifies whether $g^y \equiv (g^a)^{h(\sigma)} \beta \pmod p$. If this is false, B terminates the protocol
run with failure. Otherwise, B computes $K = H(\sigma)$ as the session key and
accepts the connection. He also destroys $\sigma$ from his memory.
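
The arithmetic of steps 6 and 7 can be sketched in Python as follows. This is an illustration under stated assumptions, not the protocol implementation: it uses a toy group ($p = 1019$, $q = 509$, $g = 4$), SHA-256 based stand-ins for $h$ and $H$, and it passes the nonces $r_A$ and $r_B$ directly, omitting the public-key and symmetric encryption of steps 1 to 5.

```python
import hashlib

# Toy domain parameters (assumption): g = 4 has prime order q = 509 mod p = 1019.
P, Q, G = 1019, 509, 4

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(u ^ v for u, v in zip(a, b))

def h(sigma: bytes) -> int:
    """Stand-in for h: {0,1}* -> Z_q \\ {0}."""
    digest = hashlib.sha256(b"h" + sigma).digest()
    return int.from_bytes(digest, "big") % (Q - 1) + 1

def H(sigma: bytes) -> bytes:
    """Stand-in key derivation function with a 128-bit output."""
    return hashlib.sha256(b"H" + sigma).digest()[:16]

def client_response(a: int, b: int, r_a: bytes, r_b: bytes):
    """Step 6: one modular multiplication and one modular addition."""
    sigma = xor_bytes(r_a, r_b)
    y = (a * h(sigma) + b) % Q
    return y, H(sigma)

def server_verify(g_a: int, beta: int, y: int, r_a: bytes, r_b: bytes):
    """Step 7: accept iff g^y = (g^a)^h(sigma) * beta (mod p)."""
    sigma = xor_bytes(r_a, r_b)
    if pow(G, y, P) != pow(g_a, h(sigma), P) * beta % P:
        return None                      # terminate with failure
    return H(sigma)                      # session key K
```

The check succeeds for an honest client because $g^{a h(\sigma) + b} = (g^a)^{h(\sigma)} g^b$, and the client's online work is limited to the two cheap operations in `client_response`.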

In step 3, B verifies that $1 < \beta < p$ and $\beta^q \equiv 1 \pmod p$. This process is called
public key validation [7]. It is a very important security measure in practice
for protecting the system from several subtle attacks such as small subgroup
attacks [18,17] and the identity element attack [7]. $H : \{0,1\}^* \to \{0,1\}^k$ is a hash
function instantiating a public random oracle [2]. It is also called a key derivation
function [14] here because it is used to derive the session key $K$ from the
shared secret $\sigma$. One reason for doing this is to destroy the algebraic relationships
between the session key $K$ and the nonces $(r_A, r_B)$. Another reason is to
mix together strong bits and potential weak bits of $\sigma$, where weak bits are certain
bits of information about $\sigma$ that can be correctly predicted with non-negligible
advantage.
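
The public key validation of step 3 amounts to two cheap checks, sketched below in Python. The function name and the toy parameters in the usage note are assumptions made for this illustration.

```python
def validate_public_value(beta: int, p: int, q: int) -> bool:
    """Public key validation: 1 < beta < p and beta^q = 1 (mod p).

    Rejects the identity element and any value outside the order-q subgroup.
    """
    return 1 < beta < p and pow(beta, q, p) == 1
```

For instance, with the toy parameters $p = 1019$ and $q = 509$, the value 4 is accepted, while 1 (the identity element) and 1018 (an element of order 2) are rejected.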

5 Security Analysis 

To prove that the protocol described above is secure, we use a variant of
Bellare and Rogaway's model [3,4]. The approach we take closely follows
that of [6].

Fig. 3. Client-Server MAKEP



A distributed system in our model has a set $I_C$ of clients and a set $I_S$ of
servers. $I_C$ (or $I_S$) is a set of identities which defines the clients (or servers) who
can participate in the protocol. The cardinalities of these two sets may be any
two polynomial functions of a security parameter $k$. The adversary E is not in
$I_C$ or $I_S$.



Definition 1. A C-S MAKEP is a triple $P = (\Pi, \Psi, LL)$ of probabilistic
polynomial-time computable functions (each with respect to its first argument). $\Pi$
specifies how an honest client behaves; $\Psi$ specifies how an honest server behaves;
and $LL$ specifies the initial distribution of clients' and servers' long-lived keys.
The domain and range of these functions are briefly described as follows.

$(m, \delta, \sigma) = \Pi(1^k, A, B, SK_A, PK_A, PK_B, conv, r)$ where

$k \in \mathbb{N}$: the security parameter;

$A \in I_C$: client's identity;

$B \in I_S$: (intended) server's identity;

$SK_A \in \{0,1\}^*$: long-lived secret key of A;

$PK_A \in \{0,1\}^* \cup \{*\}$: long-lived public key of A (the value $*$
means that the client does not have a long-lived public key;
this is the case when the client is using a symmetric encryption
algorithm in the protocol);

$PK_B \in \{0,1\}^* \cup \{*\}$: long-lived public key of B;

$conv \in \{0,1\}^*$: conversation so far;

$r \in \{0,1\}^\infty$: random coin flips;

$m \in \{0,1\}^* \cup \{*\}$: the next message sent to B;

$\delta \in \{A, R, *\}$: the decision, where A represents accept, R represents
reject and $*$ means that no decision has been made so far;

$\sigma \in \{0,1\}^* \cup \{*\}$: the private output; it is the session key when it is
string-valued.

$(m, \delta, \sigma) = \Psi(1^k, B, A, SK_B, PK_B, PK_A, conv, r)$ where the notation is
defined similarly to that for $\Pi$.

$(SK, PK) = LL(1^k, t, r)$ where $t \in \{client, server\}$ is the type of the entity.

In our protocol, if the value $t$ of $LL$ is client, $SK$ is denoted by a four-tuple
$(g, p, q, a)$ and $PK$ is denoted by another four-tuple $(g, p, q, b)$ where $p$ and $q$ are
primes such that $q \mid p - 1$. The length of $p$ is polynomial in $k$, and $g$ is an element
of $\mathbb{Z}_p^*$ of order $q$. $a$ is chosen randomly from $\mathbb{Z}_q \setminus \{0\}$ and $b = g^a \bmod p$. If $t$ is
server, the value returned by $LL$ will be the public key pair of some asymmetric
key encryption algorithm chosen by the protocol.



5.1 Communication Model (Adversarial Model) 

ADVERSARY E. All communication among interacting parties is under 
the control of an adversary. In particular, the adversary can read, inject, modify, 
delete, delay and replay messages in the model. The adversary can also start up 
entirely new “instances” of any of the parties at any time. Hence there could 
be multiple sessions engaged in the system at the same time. This model gives 
the adversary the capability to launch attacks such as reflection and interleaving 
attacks suggested in [20,5,10,19] and the hijacking attack described in Sect. 3. 

Formally, an adversary E is a probabilistic machine equipped with an
oracle denoted by $LL_{PK}$ and an infinite collection of oracles $\Pi^s_{i,j}$ and $\Psi^t_{j,i}$, for
$i \in I_C$, $j \in I_S$ and $s, t \in \mathbb{N}$. Oracle $LL_{PK}$, which will be described below, models
the event that public keys of all entities (both clients and servers) are publicly
known. Oracle $\Pi^s_{i,j}$ models the instance $s$ of client $i$ attempting to agree on a
shared session key with server $j$. Oracle $\Psi^t_{j,i}$ models the instance $t$ of server $j$
attempting to agree on a shared session key with client $i$. The model uses oracle
queries to capture E's attacks such that (1) E writes queries on a special tape;
(2) the corresponding oracle reads the tape automatically; and (3) E gets back
a response in unit time (i.e., an oracle query is treated as one step in an algorithm).

RUNNING THE PROTOCOL. A generic execution of a protocol between 
a client instance and a server instance is called a run of the protocol. We model 
one run of the protocol as conducting one experiment in the presence of an 
adversary E using a security parameter k. It is described as follows. 

1. Initialization

Toss coins for $LL$, E and all oracles $\Pi^s_{i,j}$ and $\Psi^t_{j,i}$.

2. Run E

E may make oracle queries and the queries are answered as described in
Table 1.

Table 1. The queries which E can ask of its oracles

Type  On query of                  Return

1     (SendClient, i, j, s, x)     $\Pi^{m,\delta}(1^k, i, j, SK_i, PK_i, PK_j, conv_{\Pi^s_{i,j}}, coins_{\Pi^s_{i,j}})$
                                   And then set $conv_{\Pi^s_{i,j}} \leftarrow conv_{\Pi^s_{i,j}} \cdot x \cdot m$

2     (SendServer, j, i, t, x)     $\Psi^{m,\delta}(1^k, j, i, SK_j, PK_j, PK_i, conv_{\Psi^t_{j,i}}, coins_{\Psi^t_{j,i}})$
                                   And then set $conv_{\Psi^t_{j,i}} \leftarrow conv_{\Psi^t_{j,i}} \cdot x \cdot m$

3     (RevealClient, i, j, s)      $\Pi^{\sigma}(1^k, i, j, SK_i, PK_i, PK_j, conv_{\Pi^s_{i,j}}, coins_{\Pi^s_{i,j}})$

4     (RevealServer, j, i, t)      $\Psi^{\sigma}(1^k, j, i, SK_j, PK_j, PK_i, conv_{\Psi^t_{j,i}}, coins_{\Psi^t_{j,i}})$

5     (CorruptClient, i, SK, PK)   $\langle SK_i, PK_i, conv_{\Pi^s_{i,j}}, coins_{\Pi^s_{i,j}} \rangle_{j,s}$
                                   And then set $SK_i \leftarrow SK$; $PK_i \leftarrow PK$

6     (CorruptServer, j, SK, PK)   $\langle SK_j, PK_j, conv_{\Psi^t_{j,i}}, coins_{\Psi^t_{j,i}} \rangle_{i,t}$
                                   And then set $SK_j \leftarrow SK$; $PK_j \leftarrow PK$

7     (RequestPublic)              $\langle PK_i, PK_j \rangle_{i,j}$

8     (Test, i, j, s)              Choose at random a bit $\theta$.
                                   If $\theta = 0$ return $r \in_R \{0,1\}^k$;
                                   if $\theta = 1$ return
                                   $\Pi^{\sigma}(1^k, i, j, SK_i, PK_i, PK_j, conv_{\Pi^s_{i,j}}, coins_{\Pi^s_{i,j}})$.

9     Other queries (meaningless)  $\lambda$



When the adversary makes a SendClient or SendServer query, oracle $\Pi^s_{i,j}$ or
$\Psi^t_{j,i}$ calculates the answer using the description of function $\Pi$ or $\Psi$ respectively.
Hence E gets the message $m$ and the decision A/R/*, which means that she
can "see" when an oracle accepts. When the adversary makes a RevealClient
or RevealServer query, she gets back the private output of the corresponding
oracle. The most severe type of loss for a player is when the player's complete
internal state becomes known to the adversary. To model this possibility, we allow
CorruptClient and CorruptServer queries, from which E learns the internal
state of a player, including the private key $SK_i$, and substitutes some new value
$SK$ for the player's long-lived key. From that point on, we assume that all
other players will use the revised long-lived keys of the corrupted players. When
E writes (RequestPublic) on its query tape, we think of that query as being
answered by the oracle $LL_{PK}$. The return value is simply all the public keys of both
clients and servers.

We define that $\Pi^s_{i,j}$ is

1. accepted if $\Pi^{\delta}(1^k, i, j, SK_i, PK_i, PK_j, conv_{\Pi^s_{i,j}}, coins_{\Pi^s_{i,j}}) = A$;

2. opened if there has been a RevealClient query;

3. unopened if it is not opened;

4. corrupted if there has been a CorruptClient query; and

5. uncorrupted if it is not corrupted.

Similar notation applies to the server instances.

BENIGN ADVERSARIES. We use the term benign adversary to model
a reliable channel. This is used to show that a protocol is 'well-defined', which
means that the protocol provides the two communicating parties (two oracles)
the same session key at the end of a protocol run (when the oracles have accepted).
For every $(i, j, s, t) \in I_C \times I_S \times \mathbb{N} \times \mathbb{N}$, there exists an $(i, j, s, t)$-benign
adversary which is deterministic and always performs a single run of the protocol
between $\Pi^s_{i,j}$ and $\Psi^t_{j,i}$ by faithfully relaying flows between these two oracles.

5.2 Definition of Security 

With such a powerful adversary described above, a C-S MAKEP is considered
secure if:

1. The protocol provides an instance of a client $\Pi^s_{i,j}$ with some assurance that
she is involved in a real-time communication with an instance of a server
$\Psi^t_{j,i}$, and vice versa.

2. No adversary can learn anything about a session key which is held by an
uncorrupted, unopened but accepted client instance $\Pi^s_{i,j}$ (or respective server
instance $\Psi^t_{j,i}$), where the corresponding server instance $\Psi^t_{j,i}$ (or respective client
instance $\Pi^s_{i,j}$) is uncorrupted and unopened but not necessarily accepted.

Before we give a formal definition of security, we need two more tools described 
as follows. 

MATCHING CONVERSATIONS. Matching conversations [3] provide
the necessary formalism to define the assurance provided to one player that
she has been involved in a real-time communication with another player. The
formal definition of matching conversations is given in Appendix A. Here, we
describe the notion of $\text{No-Matching}^E(k)$. This notation is used to specify the event
that when protocol $P$ is run against adversary E, there exists an uncorrupted
oracle which has accepted but there is no other oracle which has engaged in a
matching conversation with this oracle.

PROTECTING FRESH SESSION KEYS. The notion that no adversary
can learn information about fresh session keys [3,4] is formalized by using the
polynomial indistinguishability approach. Specifically, at the end of an experiment,
the adversary should not be able to gain more than a negligible advantage
in distinguishing the actual fresh session key from a random number sampled
from $\{0,1\}^k$. This idea is formalized by using the type 8 query shown in Table 1.
We make the following modifications to the experiment.

- The Test query must be the adversary's last meaningful query, and it must
be asked of a fresh oracle.

- To answer the query, the oracle flips a fair coin $\theta \in_R \{0,1\}$. If $\theta = 0$, the
oracle returns a key sampled at random from $\{0,1\}^k$. If $\theta = 1$, it returns the
session key.

- The adversary's job is to guess $\theta$; she outputs a bit Guess.

Let $\text{Good-Guess}^E(k)$ be the event that Guess = $\theta$. Then we define
$advantage^E(k) = 2 \cdot \Pr[\text{Good-Guess}^E(k)] - 1$.
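
The Test experiment and the resulting advantage can be simulated with a short Python sketch. Everything here is illustrative: the function names, the seeded pseudo-random generator, the trial count and the optional key leak used to model a broken protocol are assumptions, not part of the formal model.

```python
import random

def answer_test_query(session_key: int, k_bits: int, rng: random.Random):
    """Flip theta; return a random k-bit string if 0, the session key if 1."""
    theta = rng.randrange(2)
    challenge = rng.getrandbits(k_bits) if theta == 0 else session_key
    return theta, challenge

def estimate_advantage(adversary, leak_key=False, trials=2000, k_bits=128, seed=1):
    """Empirical 2 * Pr[Guess = theta] - 1 over independent experiments."""
    rng = random.Random(seed)
    good = 0
    for _ in range(trials):
        session_key = rng.getrandbits(k_bits)
        theta, challenge = answer_test_query(session_key, k_bits, rng)
        hint = session_key if leak_key else None   # models a leaked session key
        if adversary(challenge, hint) == theta:
            good += 1
    return 2 * good / trials - 1

def blind_adversary(challenge, hint):
    return 1                                       # learns nothing, always says 1

def informed_adversary(challenge, hint):
    return 1 if challenge == hint else 0           # knows the session key
```

An adversary without information about the key stays close to advantage 0, while one that knows the key reaches advantage 1; a secure protocol must keep every efficient adversary in the first regime.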

Now we give a formal definition of security which is modified from [6]:

Definition 2. A C-S MAKEP $P = (\Pi, \Psi, LL)$ is secure if:

1. (Well-Defined Protocol) In the presence of the benign adversary on $\Pi^s_{i,j}$
and $\Psi^t_{j,i}$, both oracles always accept holding the same session key, which is
uniformly distributed over $\{0,1\}^k$;

and for any adversary E:

2. (Real-time Partner) If two uncorrupted oracles have matching conversations,
then both oracles accept and hold the same session key;

3. (Authenticity of Flow) The probability of $\text{No-Matching}^E(k)$ is negligible;

4. (Protecting Fresh Session Key) $advantage^E(k)$ is negligible.

Here we use the conventional definition of a negligible function, that is, a real-valued
function $\epsilon(k)$ is negligible if for every $c > 0$ there exists a $k_c > 0$ such
that $\epsilon(k) < k^{-c}$ for all $k > k_c$.

Before we can show that our scheme meets the conditions of Definition 2, we 
need to specify the cryptographic primitives on which the security of our scheme 
relies. The primitives used in Client-Server MAKEP are public key encryption 
scheme, symmetric key encryption scheme and discrete logarithm problem. Their 
security definitions are derived from [12,13] and are given in Appendix A. 

Theorem 1. The Client-Server MAKEP (described in Sect. 4) is a secure C-S 
MAKEP provided that the discrete logarithm problem is intractable, a secure 
symmetric key encryption scheme exists and a secure public key encryption 
scheme exists. 

The proof of this theorem appears in Appendix B. 

6 Performance 

In Sect. 4, we presented the protocol without specifying any particular public key
cryptographic algorithm for either the TA or the server B. The choice is
left to the target application, taking into consideration its system's
capabilities and constraints. The performance evaluation given here is therefore
based on the number of times the cryptographic operations have to be performed, 
the sizes of the messages, the total number of messages sent in each protocol run 
and the memory requirement. We also restrict our attention to the efficiency 
of the client side only. Throughout this section, we also use the measurement 
results given in [27] to estimate the speed of the protocol running on a 16MHz 
Palm V with Palm OS version 3.3. 

Speed. The client is required to compute one public key encryption, one symmetric
key decryption, one modular exponentiation, one modular multiplication,
one modular addition, two hashes and two random number generations. On
average, SHA-1 takes only 0.9 msec to digest a 128-bit binary string on the
Palm V. Therefore we can ignore the time taken for hashing in our evaluation.
In practice, hash functions are also used to generate pseudo-random numbers.
Thus their generation speed is comparable to that of hashing and can be ignored.
For symmetric key decryption, both SSC2 [28] and ARC4 (Alleged RC4)
only take about one millisecond each to decrypt a 256-bit ciphertext. Even for 
a block cipher, like Rijndael [8], it takes less than three milliseconds. The public 
key encryption is also doable if we choose a public-key cryptographic algorithm 
with very efficient encryption process. For example, it takes 710 msec to do a 
512-bit RSA encryption when the value of the public exponent is three. If the 
Rabin cryptosystem is used, the encryption process will comprise one modular 
addition and one modular multiplication. By ignoring the overhead of doing any 
appropriate encoding of the plaintext, it takes only 110 msec to perform a 512-bit 
encryption. 

We notice that in step 1 of the protocol described in Sect. 4, all the parameters 
can be prepared offline as precomputation. Hence only the following operations 
have to be done by the client during the runtime of the protocol: 

1. one symmetric key decryption 

2. one modular multiplication 

3. one modular addition 

We find that a 160-bit modular addition and multiplication can be done in 0.29
msec and 15 msec respectively on a 16MHz Palm V. Thus if k is 128, the length
of $ID_B$ is 128 bits and the length of q is 160 bits, the time taken to do the
computation is less than 20 msec.
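
The estimate can be checked by adding up the timings quoted in this section. This is only a back-of-the-envelope sketch: the constant names are illustrative, and the symmetric decryption time is taken as the "about one millisecond" reported for the stream ciphers, with the three-millisecond Rijndael bound as a pessimistic alternative.

```python
# Timings in milliseconds, as quoted in the text for a 16MHz Palm V.
SYM_DECRYPT_STREAM_MS = 1.0   # SSC2 / ARC4, 256-bit ciphertext ("about one millisecond")
SYM_DECRYPT_BLOCK_MS = 3.0    # Rijndael upper bound ("less than three milliseconds")
MOD_MUL_160_MS = 15.0         # 160-bit modular multiplication
MOD_ADD_160_MS = 0.29         # 160-bit modular addition

# Online work of the client: one decryption, one multiplication, one addition.
online_stream_ms = SYM_DECRYPT_STREAM_MS + MOD_MUL_160_MS + MOD_ADD_160_MS
online_block_ms = SYM_DECRYPT_BLOCK_MS + MOD_MUL_160_MS + MOD_ADD_160_MS

print(f"stream cipher: {online_stream_ms:.2f} ms, block cipher: {online_block_ms:.2f} ms")
```

Both totals stay under the 20 msec bound stated above.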

Network and Storage Efficiency. There are only three messages exchanged
in a single run of the protocol. If 1024-bit RSA is chosen to be the public key
cryptographic algorithm for the TA, 512-bit RSA for the server, the length of p
is 512 bits and the domain parameters are publicly known, the length of $Cert_A$
would be 208 bytes. The sizes of the three messages would be 336 bytes, 32
bytes and 20 bytes respectively. Therefore, this scheme is also very suitable
for wireless communications. For storage, the client needs to store a, $Cert_A$,
$PK_{TA}$, $PK_B$, $ID_B$, $r_A$, x, b, $\beta$ and (g, p, q) if precomputation is
applied. The total memory requirement for storing these parameters is 940 bytes.
The actual memory requirement depends on the specific cryptographic algorithms
and their parameters set in each target application. We notice that much less
memory is required if G is the group of points of an elliptic curve over a finite
field and an elliptic curve cryptosystem is used by both the TA and the server.
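
The three message sizes can be reproduced with simple arithmetic. The breakdown below is an assumption inferred from the quantities named in this paper (first message: $Cert_A$, the ciphertext x and the commitment $\beta$; second message: the symmetric encryption of $r_B$ and $ID_B$; third message: y); the text itself only states the totals.

```python
def bits_to_bytes(bits: int) -> int:
    """Convert a bit length that is a multiple of 8 into bytes."""
    return bits // 8


CERT_A_BYTES = 208          # stated in the text
RSA_SERVER_BITS = 512       # size of x, a ciphertext under the server's 512-bit RSA key
P_BITS = 512                # size of beta, an element modulo p
K_BITS = 128                # length of the nonce r_B (security parameter k)
ID_BITS = 128               # length of ID_B
Q_BITS = 160                # size of y, an element modulo q

# Assumed composition of each message (not spelled out in the text).
msg1 = CERT_A_BYTES + bits_to_bytes(RSA_SERVER_BITS) + bits_to_bytes(P_BITS)
msg2 = bits_to_bytes(K_BITS + ID_BITS)
msg3 = bits_to_bytes(Q_BITS)

print(msg1, msg2, msg3)
```

Under these assumptions the sizes match the 336, 32 and 20 bytes quoted above.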



7 Concluding Remarks 

Forward Secrecy. It is clear that if the server’s private key is compromised, then 
all the session keys from the earlier runs can be recovered from the transcripts. 
However, the corruption of the client may not help to reveal the session keys. 
Hence our scheme provides half forward secrecy [7]. Since the client may be a
weak device while the server can be a strong and secure entity that supports
much stronger security measures than the client, we believe that forward secrecy
on the client side is a much more important feature than that on the server side.
On the other hand, consider the case in which all previous sessions have been
compromised, that is, the adversary E knows $\{\sigma_i\}_{0<i<n}$ where $\sigma_i$
is the "algebraic" session key of the i-th session. E may not be able to reveal
the client's private key, a, because of the unknown random values of b in these
sessions. In brief, since $r_A$ and $r_B$ are nonces and no single party can control
the values of the session keys, we assume that $\sigma_i \neq \sigma_j$ for $i \neq j$.
Hence if h is a cryptographic hash function, the probability that
$h(\sigma_i) = h(\sigma_j)$ is negligible. Similarly, since b is randomly chosen
from $\mathbb{Z}_q \setminus \{0\}$, we assume that $b_i \neq b_j$ for $i \neq j$. If E
wants to obtain a, E needs to solve the following equations:

$$y_i = a\,h(\sigma_i) + b_i \pmod{q} \quad \text{for } 0 < i < n$$

with unknowns a and $\{b_i\}_{0<i<n}$. It can be seen that there are $(q - 1)$ sets
of possible solutions. Hence it is no easier than solving the discrete logarithm
problem.

Similarly, knowing the random values of b for a couple of sessions alone may
not be enough to reveal the client's private key either. Both b and $h(\sigma)$ of
a particular session must be known in order to reveal a. Alternatively, two
sessions with the same value of b and with known values of $h(\sigma)$ also
suffice to compute the client's private key, namely
$a = (y_1 - y_2)/(h(\sigma_1) - h(\sigma_2)) \pmod{q}$.
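
The key-recovery formula for a reused b can be checked on a toy instance. The sketch below uses a tiny prime q and fixed stand-in values for $h(\sigma_1)$ and $h(\sigma_2)$; all numbers and the helper name `inv_mod` are illustrative assumptions, not parameters from the paper.

```python
q = 11                      # toy prime group order (real q is 160 bits)
a = 7                       # client's private key
b = 5                       # the same random value reused in two sessions
h1, h2 = 3, 9               # stand-ins for h(sigma_1) and h(sigma_2), distinct mod q


def inv_mod(x: int, modulus: int) -> int:
    """Modular inverse for a prime modulus via Fermat's little theorem."""
    return pow(x % modulus, modulus - 2, modulus)


# Responses seen on the wire in the two sessions: y_i = a*h(sigma_i) + b mod q.
y1 = (a * h1 + b) % q
y2 = (a * h2 + b) % q

# Subtracting the two equations cancels b and isolates a.
recovered_a = ((y1 - y2) * inv_mod(h1 - h2, q)) % q
print(recovered_a)
```

Subtracting the two response equations cancels b, which is why b must never repeat across sessions.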

Precomputation. As mentioned earlier, the scheme benefits from precomputation,
which significantly reduces the computational requirement during
the runtime of the protocol. In most applications, the client can conduct
the precomputation during idle time. This technique also helps to reduce the
peak power consumption by spreading most of the computation over time.

Scalability. The protocols proposed in [15,26] are either server-specific or limited
by the memory capacity and the precomputation overhead. Server-specific means
that the client needs to determine in advance the server she wants to communicate
with and has to do some precomputation offline. In this scheme, the precomputation
is optional and applications have the flexibility to choose the extent of
precomputation. For example, the client can pre-select the values of $r_A$ and b
and precompute $\beta$. During runtime, she only needs to compute x using a public
key encryption before sending out the first message. As mentioned in Sect. 6,
this can still be done efficiently if we choose a public-key cryptographic algorithm
with a very efficient encryption process. In this way, the values precomputed by
the client do not depend on any specific server. The scheme therefore gives the
full scalability that other public-key MAKEPs provide, but with higher efficiency
and fewer messages.

Hash function h. The hash function h specified in the Client-Server MAKEP
outputs an integer in $\mathbb{Z}_q \setminus \{0\}$. As the client authentication is
essentially the same as Schnorr's identification scheme [23,24], the requirement
on h can be relaxed to output a binary string in $\{0, \cdots, 2^t - 1\}$ where
$2^{-t}$ governs the success rate of an adversary launching a crooked proof attack,
which is described in the proof of Claim 3.

A Definitions

For matching conversations, we use the same definition as given in [3]. Without
loss of generality, we may assume the number of flows R in the protocol to be
odd. Let E be an adversary. For any oracle $\Pi_{A,B}^{s}$ or $\Pi_{B,A}^{t}$, its
conversation can be captured by a sequence:

$$C = (\tau_1, \alpha_1, \beta_1), (\tau_2, \alpha_2, \beta_2), \cdots, (\tau_m, \alpha_m, \beta_m).$$

This sequence encodes that at time $\tau_1$ the oracle was asked $\alpha_1$ and responded
with $\beta_1$; at time $\tau_2 > \tau_1$, the oracle was asked $\alpha_2$ and answered
$\beta_2$; finally, at time $\tau_m$, it was asked $\alpha_m$ and answered $\beta_m$. At
time $\tau_m$, adversary E terminates without asking any more queries. If oracle
$\Pi_{A,B}^{s}$ (or $\Pi_{B,A}^{t}$) has $\alpha_1 = \lambda$, it is called an
initiator oracle; otherwise it is called a responder oracle.

Definition 3 ([3]). Let P be an R-flow protocol, where $R = 2\rho - 1$ is the number
of flows. Run P in the presence of an adversary E and consider two oracles, an
initiator oracle and a responder oracle, that engage in conversations $C$ and $C'$
respectively.

1. $C'$ is said to be a matching conversation to $C$ if there exist
$\tau_0 < \tau_1 < \cdots < \tau_{R-1}$ and $\alpha_1, \beta_1, \cdots, \beta_{\rho-1}, \alpha_\rho$
such that $C$ is prefixed by:

$$(\tau_0, \lambda, \alpha_1), (\tau_2, \beta_1, \alpha_2), \cdots, (\tau_{2\rho-2}, \beta_{\rho-1}, \alpha_\rho)$$

and $C'$ is prefixed by:

$$(\tau_1, \alpha_1, \beta_1), (\tau_3, \alpha_2, \beta_2), \cdots, (\tau_{2\rho-3}, \alpha_{\rho-1}, \beta_{\rho-1}).$$

2. $C$ is said to be a matching conversation to $C'$ if there exist
$\tau_0 < \tau_1 < \cdots < \tau_R$ and $\alpha_1, \beta_1, \cdots, \beta_{\rho-1}, \alpha_\rho$
such that $C'$ is prefixed by:

$$(\tau_1, \alpha_1, \beta_1), (\tau_3, \alpha_2, \beta_2), \cdots, (\tau_{2\rho-3}, \alpha_{\rho-1}, \beta_{\rho-1}), (\tau_{2\rho-1}, \alpha_\rho, *)$$

and $C$ is prefixed by:

$$(\tau_0, \lambda, \alpha_1), (\tau_2, \beta_1, \alpha_2), \cdots, (\tau_{2\rho-2}, \beta_{\rho-1}, \alpha_\rho).$$

If $C'$ is a matching conversation to $C$ and $C$ is a matching conversation to $C'$,
then the two oracles are said to have had matching conversations.

The following definitions are derived from [12,13]. They are only briefly
introduced here.

Definition 4. A public key encryption scheme is a triple, $(\mathcal{G}, \mathcal{E}, \mathcal{D})$,
of probabilistic polynomial-time algorithms satisfying the following conditions:

1. key generation algorithm: $(e, d) \leftarrow \mathcal{G}(1^k)$ where k is the security
parameter, e is the public key, and d is the corresponding private key.

2. encryption algorithm: $c \leftarrow \mathcal{E}_e(1^k, m)$ where $m \in \{0,1\}^k$ is the
message, $c \in \{0,1\}^l$ is the ciphertext and l is polynomial in k.

3. decryption algorithm: $m \leftarrow \mathcal{D}_d(1^k, c)$.

Definition 5. A public key encryption scheme $(\mathcal{G}, \mathcal{E}, \mathcal{D})$ is secure
(polynomial time indistinguishable) if for every PPT algorithm E and for every
polynomial Q, for all sufficiently large k,

$$\Pr[E(1^k, e, m_0, m_1, c) = m \mid (e, d) \leftarrow \mathcal{G}(1^k);\ m_0 \leftarrow \{0,1\}^k;\ m_1 \leftarrow \{0,1\}^k;\ m \leftarrow \{m_0, m_1\};\ c \leftarrow \mathcal{E}_e(m)] < \frac{1}{2} + \frac{1}{Q(k)}$$

B Proof of Theorem 1 

Proof (sketch). We prove the security of the protocol by establishing each
condition of Definition 2.

Conditions 1 and 2: The first two conditions follow immediately from the
description of the Client-Server MAKEP and the assumption that h is a hash
function instantiating a random oracle [2].

Condition 3: We prove this by contradiction. Assume that E is an arbitrary
adversary and that $\Pr[\text{No-Matching}^{E}(k)]$ is non-negligible; we then show
that certain cryptographic primitives which are assumed to be secure would be
broken. We divide the proof into several subsections.

SERVER AUTHENTICATION 

Consider the first two messages of the protocol shown in Fig. 3. 

Claim 1 If there exists a secure pair of symmetric key encryption scheme and
public key encryption scheme, then upon receiving $x = \mathcal{E}_{PK_B}(r_A)$, only B can
compute the second message, $\mathcal{E}_{r_A}(r_B, ID_B)$; that is, for every PPT algorithm E,
there is a negligible function $\epsilon(k)$ such that for sufficiently large k,

$$\Pr[E(1^k, x, ID_B, PK_B) = \mathcal{E}_{r_A}(r, ID_B) \mid r_A \leftarrow \{0,1\}^k;\ x \leftarrow \mathcal{E}_{PK_B}(r_A)] < \epsilon(k)$$

where r is some random number of length k.

This can be shown by contradiction. Suppose that on inputs $1^k$, $\mathcal{E}_{PK_B}(r_A)$, $ID_B$
and $PK_B$, E computes $\mathcal{E}_{r_A}(r, ID_B)$ without asking an oracle of B (i.e. without
knowing $SK_B$) where r is some random number of length k. We can construct
a machine C to break the public key encryption scheme. The following is only a
sketch of the complete proof. Formally we should simulate an adversary's point
of view completely like the proofs given in [3,4,6].

C = "On inputs $1^k$, $PK_B$, $m_0$, $m_1$ and $c = \mathcal{E}_{PK_B}(m)$ where $m \leftarrow \{m_0, m_1\}$:

1. We simulate E's view and answer all the oracle queries involved.

2. For an initiator oracle, set x = c (i.e. $r_A = m$) and denote the identity of
the corresponding responder as $ID_B$.

3. For the second query of the initiator oracle (here we think the query contains
$\mathcal{E}_{r_A}(r, ID_B)$), decrypt the query under $m_0$ and check if the decrypted message
contains a proper coding of $ID_B$ with some number.

4. If it is true, output $m_0$; otherwise decrypt the query under $m_1$ and check the
validity of the decrypted message again.

5. If it passes, output $m_1$; otherwise give up."

Claim 2 The first two messages of Client-Server MAKEP are sent in the correct
order.

Let the adversary E make Q(k) oracle calls. If we assume that B produces
$\mathcal{E}_{r_A}(r_B, ID_B)$ before A sends out $\mathcal{E}_{PK_B}(r_A)$, then the probability that B is
queried with the correct value of $\mathcal{E}_{PK_B}(r_A)$ is at most $Q(k) \cdot 2^{-k}$, which is
negligible.

CLIENT AUTHENTICATION 

Claim 3 Assuming the discrete logarithm problem is intractable, only A
can compute the correct pair $(\beta, y)$ such that $g^{y} = (g^{a})^{h(\sigma)} \beta \pmod{p}$.

The authentication mechanism is similar to Schnorr's identification scheme [23,
24]. Its security is based on the intractability of the discrete logarithm problem
where the problem instance is $\log_g g^{a}$. To see the forgery probability of the
client authentication, we consider an adversary E who impersonates A by choosing
some b, guessing the correct value of $h(\sigma)$ (which may be obtained from the
guessed value of $r_B$ or $\sigma$ instead) and sending $\beta = g^{b} (g^{a})^{-h(\sigma)}$ and
then $y = b$ to B. This is called the crooked proof attack. The probability of
success for this attack is $1/\min(|\mathcal{H}|, 2^{k})$, where $\mathcal{H}$ denotes the range
of h. In the original papers, Schnorr showed that this success rate cannot be
increased unless computing the discrete logarithm is easy. For a detailed security
analysis, see the original papers as well as [25].
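
To make the verification equation and the crooked proof attack concrete, here is a toy sketch with tiny parameters (p = 23, q = 11, g = 2, where g has order q modulo p). The check $g^{y} = (g^{a})^{h(\sigma)} \beta \bmod p$ follows from $y = a\,h(\sigma) + b \bmod q$ and $\beta = g^{b}$. All numeric values and function names are illustrative assumptions, and the challenge h stands in for $h(\sigma)$.

```python
p, q, g = 23, 11, 2         # toy parameters: g has prime order q modulo p
a = 7                       # client's private key
pk = pow(g, a, p)           # client's public key g^a mod p


def verify(beta: int, y: int, h: int) -> bool:
    """Server-side check g^y == (g^a)^h * beta (mod p)."""
    return pow(g, y, p) == (pow(pk, h, p) * beta) % p


def honest_proof(b: int, h: int):
    """Client who knows a: beta = g^b, y = a*h + b mod q."""
    return pow(g, b, p), (a * h + b) % q


def crooked_proof(b: int, h_guess: int):
    """Impersonator who guesses h in advance: beta = g^b * (g^a)^(-h_guess), y = b."""
    pk_inv = pow(pk, p - 2, p)          # inverse of g^a modulo the prime p
    beta = (pow(g, b, p) * pow(pk_inv, h_guess, p)) % p
    return beta, b


h = 3                                    # stand-in for h(sigma)
honest_ok = verify(*honest_proof(5, h), h)
lucky_ok = verify(*crooked_proof(5, 3), h)    # guess matches h
unlucky_ok = verify(*crooked_proof(5, 4), h)  # guess differs from h
print(honest_ok, lucky_ok, unlucky_ok)
```

The impersonator succeeds only when the guessed value equals the actual challenge, which is the source of the success probability discussed above.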

Claim 4 A cannot send y out before B sends the second message.

The proof is similar to that for Claim 2.

Condition 4: As all the messages are generated by the intended parties in the
right order in a single run of the protocol, it is obvious that an adversary cannot
obtain any information about $r_A$ or $r_B$ from the messages provided that there
exists a secure public key encryption scheme and a secure symmetric key
encryption scheme. □

Abstract. Dynamic group Diffie-Hellman protocols for Authenticated
Key Exchange (AKE) are designed to work in a scenario in which the
group membership is not known in advance but where parties may join
and may also leave the multicast group at any given time. While several
schemes have been proposed to deal with this scenario, no formal
treatment of this cryptographic problem has ever been suggested. In this
paper, we define a security model for this problem and use it to precisely
define Authenticated Key Exchange (AKE) with "implicit" authentication
as the fundamental goal, and the entity-authentication goal as well.
We then define in this model the execution of a protocol modified from
a dynamic group Diffie-Hellman scheme offered in the literature and
prove its security.



1 Introduction 

1.1 The Group Diffie-Hellman Key Exchange

Group Diffie-Hellman schemes for Authenticated Key Exchange are designed
to provide a pool of players communicating over a public network and holding
long-lived secrets with a session key to be used to achieve multicast message
confidentiality or multicast data integrity. In this paper, we consider the scenario
in which the group membership is not known in advance - dynamic rather than
static - where parties may join and leave the multicast group at any given time.

After the initialization phase, and throughout the lifetime of the multicast
group, the parties need to be able to engage in a conversation after each change
in the membership, at the end of which the session key is updated to be sk'.
The secret value sk' is only known to the parties in the multicast group during
the period when sk' is the session key. The adversary may generate repeated
and arbitrarily ordered changes in the membership for subsets of parties of his
choice.

Sciences Division, of the U.S. Department of Energy under Contract No. DE-AC03-
76SF00098. This document is report LBNL-48202.

C. Boyd (Ed.): ASIACRYPT 2001, LNCS 2248, pp. 290-309, 2001.

The above scenario is a distributed application in which up to one hundred
parties work together in order to get a task done where many of the parties
may be sending data to the multicast group [13]. Examples of such applications
include replicated servers [22], audio-video conferencing [21] and collaborative
tools [2].

Several papers [3,19,20,29] have addressed this scenario and one of its
incarnations is the system offered in [1]. However these protocols, and this
existing system, are based on or use an informal approach and do not rely on
proofs of security. Such approaches are often found to be flawed several years
later and, indeed, weaknesses have already been discovered for some protocols [24].
One way to improve the security of the protocols is to complete formal proofs and
thus avoid many of the weaknesses.



1.2 The Security Notions 

In the paradigm of provable security [25] one identifies a concrete cryptographic 
problem to solve (like the group Diffie-Hellman key exchange) and defines a for- 
mal model for this problem. The model captures the capabilities of the adversary 
and the capabilities of the players. Within this model one defines security goals 
to capture what it means for a group Diffie-Hellman scheme to be secure. And, 
for a particular scheme one exhibits a proof of its security. The security proof 
aims to show that the scheme actually achieves the claimed security goals under 
computational assumptions. 

The fundamental security goal for a group Diffie-Hellman scheme to achieve is
Authenticated Key Exchange (with "implicit" authentication), identified as AKE.
In AKE, each player is assured that no other player aside from the arbitrary pool
of players can learn the session key. Another, stronger and highly desirable goal
for a group Diffie-Hellman scheme to provide is Mutual Authentication (MA). In
MA, each player is assured that only its partners actually have possession of the
distributed session key.

With these security goals in hand the security of a group Diffie-Hellman 
scheme can be analyzed in the standard model or in an idealized model of com- 
putation (ideal-hash model [7,14], ideal-cipher model [5], generic model [27]). 
Previous security analyses in the ideal-hash model, the so-called random-oracle 
model [7,14] wherein the cryptographic hash functions (like SHA or MD5) are 
viewed as random functions, provide satisfactorily convincing guarantees of se- 
curity for numerous cryptographic schemes [8,15,26] although not at the same 
level as those in the standard model. 

1.3 Contributions 

This paper provides major contributions to the solution of the group
Diffie-Hellman key exchange problem. We present the first formal model to help
manage the complexity of definitions and proofs for the authenticated group
Diffie-Hellman key exchange when the group membership is dynamic. This model is
equipped with some notions of dynamicity in the group membership where the
various types of attacks are modeled by queries to the players. This model does
not yet encompass attacks involving multiple instances of a player activated
concurrently and simultaneously by the adversary. Also, in order to be correctly
formalized, the intuition behind mutual authentication requires cumbersome
definitions of session IDS and partner IDS which may be skipped on a first
reading.

We start with the model and definitions introduced in [11] and extend them
to deal with the authenticated dynamic group Diffie-Hellman key exchange. We
define the partnering, freshness of session key and measures of security for AKE.
In this model we define the execution of a protocol, which we refer to as AKE1,
modified from [3] and show that it can be proven secure under reasonable and
well-defined intractability assumptions.

Our paper is organized as follows. In the remainder of this section we
summarize the related work. In Section 2 we define our security model. We use it in
Section 3 to define the security definitions that should be satisfied by a group
Diffie-Hellman scheme. We present the AKE1 protocol in Section 4 and justify
its security in the random oracle model. Finally in Section 5 we briefly deal with
MA in the random oracle model.

1.4 Related Work 

Many group Diffie-Hellman protocols [3,4,12,16,18,28,30,31] aim to distribute 
a session key among the multicast group members for a scenario in which the 
membership is static and known in advance. However these protocols are not 
well-suited for a scenario in which members join and leave the multicast group 
at a relatively high rate. Fortunately, these protocols can be extended to address 
this latter scenario and several papers [3,19,20,29] have shown how to do so. The 
protocol presented in [3] has been found to be flawed in [24] and the other papers
assume authenticated links, or more specifically do not consider the AKE and MA
goals as part of the protocols. These goals need to be addressed separately.

A first step has already been taken toward a formal treatment of the authen- 
ticated Diffie-Hellman key exchange problem in the multi-party setting. Indeed, 
we presented in [11] the first formal model for this problem for a scenario in 
which the membership is static. The model was derived from Bellare et al.’s 
model of distributed computing [5,17]. Addressed in detail were the AKE and 
MA goals. For each we presented a definition, a protocol and a proof that the 
protocol achieves these goals. 

2 The Model 

In this section we formalize the group Diffie-Hellman key exchange and the 
adversary’s capabilities. In our formalization, the players do not deviate from 
the protocol, the adversary is not a player and the adversary’s capabilities are 
modeled by various queries. These queries provide the adversary with the capability
to initialize a multicast group via Setup-queries, add players to the multicast
group via Join-queries, and remove players from the multicast group via
Remove-queries.



2.1 Protocol Participants 

We fix a nonempty set $\mathcal{U}$ of players that can participate in a group
Diffie-Hellman key exchange protocol P. The number n of players is polynomial in
the security parameter k. When we mean a specific player of $\mathcal{U}$ we use $U_i$,
while when we mean an unspecified member of $\mathcal{U}$ we use U without any index.

We also consider a nonempty subset of $\mathcal{U}$ which we call the multicast group
$\mathcal{I}$. In $\mathcal{I}$ a player $U_{GC}$, the so-called "group controller", initiates the
addition of players to $\mathcal{I}$ or the removal of players from $\mathcal{I}$. $U_{GC}$ is
trusted to do only this.



2.2 Long-Lived Keys 

Each player $U \in \mathcal{U}$ holds a long-lived key $LL_U$ which is either a pair of
matching public/private keys or a symmetric key. Associated to protocol P is an
LL-key generator $\mathcal{G}_{LL}$ which at initialization generates $LL_U$ and assigns it
to U.



2.3 Generic Group Diffie-Hellman Schemes

A group Diffie-Hellman scheme P for $\mathcal{U}$ is defined by four algorithms (the
session key SK is known by any player in $\mathcal{I}$ but unknown to any player not in
$\mathcal{I}$):

— the key generation algorithm $\mathcal{G}_{LL}$, which takes as input $1^k$, where k is the
security parameter, and provides each player in $\mathcal{U}$ with a long-lived key $LL_U$.
$\mathcal{G}_{LL}$ is a probabilistic algorithm.

— the setup algorithm, which takes as input a set of players $\mathcal{J}$, sets variable
$\mathcal{I}$ to be $\mathcal{J}$ and provides each player U in $\mathcal{I}$ with a session key $SK_U$.
The setup algorithm is an interactive multi-party protocol between some players
of $\mathcal{U}$.

— the remove algorithm, which takes as input a set of players $\mathcal{J}$, updates
variable $\mathcal{I}$ to be $\mathcal{I} \setminus \mathcal{J}$ (the set of all players in $\mathcal{I}$ that are
not in $\mathcal{J}$) and provides each player U in this updated set with an updated
session key $SK_U$. The remove algorithm is an interactive multi-party protocol
between some players of $\mathcal{U}$.

— the join algorithm, which takes as input a set of players $\mathcal{J}$, updates variable
$\mathcal{I}$ to be $\mathcal{I} \cup \mathcal{J}$ and provides each player U in this updated set with an
updated session key $SK_U$. The join algorithm is an interactive multi-party
protocol between some players of $\mathcal{U}$.

An execution of P consists of running the key generation algorithm once, and
then many times the setup, remove and join algorithms. We will also use the
term operation to mean one of the algorithms: setup, remove or join.
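
The bookkeeping performed by the three operations can be sketched as follows. This is only an illustration of how the membership variable and the session keys evolve: the class and method names are hypothetical, and the fresh session key is drawn at random as a placeholder rather than derived by an interactive Diffie-Hellman protocol.

```python
import secrets


class MulticastGroup:
    """Tracks the multicast group I and the session key SK_U held by each member."""

    def __init__(self):
        self.members = set()        # the variable I
        self.session_keys = {}      # player -> SK_U, defined only for members

    def _rekey(self):
        # Placeholder: a real scheme runs an interactive multi-party protocol here.
        sk = secrets.token_bytes(16)
        self.session_keys = {u: sk for u in self.members}

    def setup(self, J):
        self.members = set(J)       # I := J
        self._rekey()

    def remove(self, J):
        self.members -= set(J)      # I := I \ J
        self._rekey()

    def join(self, J):
        self.members |= set(J)      # I := I u J
        self._rekey()


group = MulticastGroup()
group.setup({"U1", "U2", "U3"})
group.remove({"U2"})
group.join({"U4", "U5"})
print(sorted(group.members))
```

After each operation every current member holds the same updated key, and players outside the group hold none.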

Session IDS. We define the session IDS (SIDS) for player $U_i$ in an execution of
protocol P as $\text{SIDS}(U_i) = \{\text{SID}_{ij} : j \in ID\}$ where $\text{SID}_{ij}$ is the
concatenation of all flows that $U_i$ exchanges with player $U_j$ in executing an
operation. Therefore, $U_i$ sets $SK_{U_i}$ to $\emptyset$ and $\text{SIDS}(U_i)$ to $\emptyset$ before
executing an operation. (SIDS is publicly available.)



Accepting and Terminating. A player U accepts when it has enough
information to compute a session key $SK_U$. At any time a player U who is in an
"expecting state" can accept, and it accepts at most once in executing an
operation. As soon as U accepts in executing an operation, SK and SIDS are defined.
Having accepted, U has not yet terminated this execution. Player U
may want to get confirmation that its partners in this execution have actually
computed SK or that they are really the ones it wants to share a session key
with. As soon as U gets this confirmation message, it terminates the execution of
this operation - it will not send out any more messages and remains in a "stand
by" state until the next operation.



2.4 Security Model 

Queries. The adversary A interacts with the players U by making various 
queries. There are seven types of queries. The Setup, Join and Remove queries 
may at first seem useless since, using Send queries, the adversary already has 
the ability to initiate a setup, a remove or a join operation. Yet these queries 
are essential for properly dealing with the dynamic case. To deal with sequential 
membership changes, these three queries are only available if all the players in U 
have terminated. We now explain the capability that each kind of query captures. 

— Setup($\mathcal{J}$): This query models adversary A initiating the setup operation. The
query is only available to adversary A if all the players in $\mathcal{U}$ have terminated
and are thus in a "stand by" state. A gets back from the first player U in $\mathcal{J}$
the flow initiating the setup execution. Other players are aware of the setup
and move to an "expecting state" but do not reply with any message.

— Remove($\mathcal{J}$): This query models adversary A initiating the remove operation.
The query is only available to adversary A if all the players in $\mathcal{U}$ have
terminated. A gets back from the group controller $U_{GC}$ the flow initiating the
remove execution. Other players are aware of the remove operation but do
not reply. They move from a "stand by" state to an "expecting state".

— Join($\mathcal{J}$): This query models adversary A initiating the join operation. The
query is only available to adversary A if all the players in $\mathcal{U}$ have terminated.
A gets back from the group controller $U_{GC}$ the flow initiating the join
execution. Other players are aware of the join operation but do not reply. They
move from a "stand by" state to an "expecting state".

— Send(U, m): This query models adversary A sending a message to a player.
The adversary A gets back from his query the response which player U would
have generated in processing message m (this could be the empty string if the
message is incorrect or unexpected). If player U has not yet terminated and
the execution of protocol P leads to accepting, variable SIDS(U) is updated
as explained above.

— Reveal(U): This query models the attacks resulting in the misuse of the
session key, which may then be revealed. The query is only available to
adversary A if player U has accepted. The Reveal-query unconditionally forces
player U to release $SK_U$, which is otherwise hidden from the adversary.

— Corrupt(U): This query models the attacks resulting in player U's LL-key
being revealed. A gets back $LL_U$ but does not get any internal data of U
executing P.

— Test(U): This query models the semantic security of the session key SK,
namely the following game $\text{Game}^{ake}(\mathcal{A}, P)$ between adversary A and the
players U involved in an execution of the protocol P. The Test-query is only
available if U is Fresh (see Section 3). In the game A asks any of the above
queries; however, it can only ask a Test-query once. Then, one flips a coin b
and returns $sk_U$ if b = 1 or a random string if b = 0. At the end of the game,
adversary A outputs a bit b' and wins the game if b = b'.
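
A minimal sketch of how a Test-query could be answered is given below. The function names are hypothetical, the random string is assumed to have the same length as the session key, and the freshness check that gates the query is left out.

```python
import secrets


def answer_test_query(sk_u: bytes, b: int) -> bytes:
    """Return the real session key if b == 1, else a random string of equal length."""
    if b == 1:
        return sk_u
    return secrets.token_bytes(len(sk_u))


def adversary_wins(b: int, b_guess: int) -> bool:
    """The adversary wins the game when its output bit b' equals the hidden coin b."""
    return b == b_guess


sk = bytes(range(16))               # stand-in for an accepted session key
b = secrets.randbelow(2)            # hidden coin flip
challenge = answer_test_query(sk, b)
print(len(challenge))
```

An adversary that cannot tell the two cases apart can do no better than guess b, which is what the advantage in Section 3.3 measures.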



Executing the Game. Choose a protocol P with a session-key space SK, and
an adversary A. The security definitions take place in the context of making A
play $\text{Game}^{ake}(\mathcal{A}, P)$. P determines how players behave in response to messages
from the environment. A sends these messages: she controls all communications
between players; she can repeatedly initiate, in a non-concurrent way but in
arbitrary order, sequential changes in the membership for subsets of players of
her choice; she can at any time force a player U to divulge SK or, more seriously,
$LL_U$. This game is initialized by providing coin tosses to $\mathcal{G}_{LL}$, A and all U, and
running $\mathcal{G}_{LL}(1^k)$ to set $LL_U$. Then

1. Initialize any U with SIDS ← null, PIDS ← null, SK ← null,

2. Initialize adversary A with $1^k$ and access to all U,

3. Run adversary A and answer queries made by A as defined above.

3 The Definitions 

In this section we present the definitions that should be satisfied by a group
Diffie-Hellman scheme. We define the partnering from the session IDS and use it
to define measures of an adversary's success in defeating the security goals.
We also recall that a function $\epsilon(k)$ is negligible if for every $c > 0$ there
exists a $k_c > 0$ such that for all $k > k_c$, $\epsilon(k) < k^{-c}$.

3.1 Partnering Using SIDS 

The partnering captures the intuitive notion that the players with which $U_i$ has
exchanged messages in executing an operation are the players with which $U_i$
believes it has established a session key. Another simple way to understand the
notion of partnering is that $U_j$ is a partner of $U_i$ in the execution of an
operation if $U_j$ and $U_i$ have directly exchanged messages or there exists some
sequence of players that have directly exchanged messages from $U_j$ to $U_i$.

In an execution of P, or in $\text{Game}^{ake}(\mathcal{A}, P)$, we say that players $U_i$ and $U_j$
are directly partnered if both players accept and
$\text{SIDS}(U_i) \cap \text{SIDS}(U_j) \neq \emptyset$ holds. We denote the direct partnering as
$U_i \leftrightarrow U_j$.

We also say that players $U_i$ and $U_j$ are partnered if both players accept
and if, in the graph $G_{SIDS} = (V, E)$ where $V = \{U_i : i = 1, \ldots, |\mathcal{I}|\}$ and
$E = \{(U_i, U_j) : U_i \leftrightarrow U_j\}$, the following holds:

$$\exists k > 1,\ \langle U_1, U_2, \ldots, U_k \rangle \text{ with } U_1 = U_i,\ U_k = U_j,\ U_{l-1} \leftrightarrow U_l \text{ for } 1 < l \le k.$$

We denote this partnering as $U_i \leftrightsquigarrow U_j$.

We complete in polynomial time (in $|V|$) the graph $G_{SIDS}$ to obtain the
graph of partnering: $G_{PIDS} = (V', E')$ where $V' = V$ and
$E' = \{(U_i, U_j) : U_i \leftrightsquigarrow U_j\}$, and then define the partner IDS for oracle
$U_i$ as:

$$\text{PIDS}(U_i) = \{U_j : U_i \leftrightsquigarrow U_j\}$$
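
The completion of the direct-partnering graph can be sketched as a breadth-first search. The code assumes that every listed player has accepted, represents each SID as an opaque string, and leaves a player out of its own partner set; these choices and the function names are assumptions made for the example.

```python
from collections import deque


def directly_partnered(sids: dict, u: str, v: str) -> bool:
    """Two distinct players are directly partnered if their SIDS sets intersect."""
    return u != v and bool(sids[u] & sids[v])


def partner_ids(sids: dict, u: str) -> set:
    """Players reachable from u through a chain of direct partnerings."""
    seen = {u}
    queue = deque([u])
    while queue:
        current = queue.popleft()
        for other in sids:
            if other not in seen and directly_partnered(sids, current, other):
                seen.add(other)
                queue.append(other)
    return seen - {u}


# Toy SIDS: U1-U2 and U2-U3 exchanged flows; U4 talked to nobody in the group.
sids = {
    "U1": {"flows-1-2"},
    "U2": {"flows-1-2", "flows-2-3"},
    "U3": {"flows-2-3"},
    "U4": {"flows-4-x"},
}
print(sorted(partner_ids(sids, "U1")))
```

Here U1 and U3 never exchanged messages directly, yet they are partnered through U2.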



3.2 Freshness 

A player U is Fresh, in the current operation execution, (or holds a Fresh SK)
if the following two conditions are satisfied. First, nobody in $\mathcal{U}$ has ever been
asked a Corrupt-query since the beginning of the game. Second, in the current
operation execution, U has accepted and neither U nor its partners PIDS(U)
have been asked a Reveal-query.

We also recall that forward-secrecy entails that loss of an LL-key does not
compromise the semantic security of previously-distributed session keys.
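
The two freshness conditions translate directly into a predicate. In this sketch the sets `accepted`, `corrupted` and `revealed` and the mapping `pids` are hypothetical bookkeeping that a game simulator would maintain; they are not part of the paper's notation.

```python
def is_fresh(u: str, accepted: set, corrupted: set, revealed: set, pids: dict) -> bool:
    """Freshness of player u in the current operation execution."""
    # Condition 1: no player at all has been asked a Corrupt-query so far.
    if corrupted:
        return False
    # Condition 2: u has accepted, and neither u nor its partners were revealed.
    if u not in accepted:
        return False
    return not (({u} | pids[u]) & revealed)


pids = {"U1": {"U2", "U3"}, "U2": {"U1", "U3"}, "U3": {"U1", "U2"}}
accepted = {"U1", "U2", "U3"}

fresh_clean = is_fresh("U1", accepted, set(), set(), pids)
fresh_after_partner_reveal = is_fresh("U1", accepted, set(), {"U3"}, pids)
fresh_after_corrupt = is_fresh("U1", accepted, {"U9"}, set(), pids)
print(fresh_clean, fresh_after_partner_reveal, fresh_after_corrupt)
```

Note that a Corrupt-query on any player, even one outside the current group, removes freshness under this definition.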

3.3 Security Notions 

AKE Security. In an execution of P, we say an adversary A wins if she asks
a single Test-query to a Fresh player U and correctly guesses the bit b used in
the game $\text{Game}^{ake}(\mathcal{A}, P)$. We denote the ake advantage as $\text{Adv}_P^{ake}(\mathcal{A})$; the
advantage is taken over all bit tosses. (The advantage is twice the probability
that A will defeat the AKE security goal of the protocol minus one^1.) Protocol
P is an A-secure AKE if $\text{Adv}_P^{ake}(\mathcal{A})$ is negligible.
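
The rescaling in the definition of the advantage is simple arithmetic; the short sketch below (with a hypothetical function name) shows that blind guessing yields advantage zero and a perfect distinguisher yields advantage one.

```python
def ake_advantage(win_probability: float) -> float:
    """Advantage = 2 * Pr[adversary guesses b correctly] - 1."""
    return 2 * win_probability - 1


blind_guess = ake_advantage(0.5)     # adversary learns nothing about b
perfect = ake_advantage(1.0)         # adversary always identifies the real key
print(blind_guess, perfect)
```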



MA Security. In an execution of P, we say adversary A violates mutual
authentication (MA) if there exists an operation execution wherein a player U
terminates holding SIDS(U), PIDS(U) and $|\text{PIDS}(U)| \neq |\mathcal{I}| - 1$. We denote the
ma success as $\text{Succ}_P^{ma}(\mathcal{A})$ and say protocol P is an A-secure MA if
$\text{Succ}_P^{ma}(\mathcal{A})$ is negligible.

Therefore to deal with mutual authentication, we consider a new game, which we
denote $\text{Game}^{ma}(\mathcal{A}, P)$, wherein the adversary plays in exactly the same way as
in the game $\text{Game}^{ake}(\mathcal{A}, P)$ with the same player accesses but with a different
goal: to violate the mutual authentication.

^1 A can trivially defeat AKE with probability 1/2; multiplying by two and subtracting
one rescales the probability.

Secure Signature Schemes. A signature scheme is defined by the following [26]:

- Key generation algorithm $\mathcal{G}$. On input $1^k$, where k is the
security parameter, the algorithm $\mathcal{G}$ produces a pair $(K_p, K_s)$ of
matching public and secret keys. Algorithm $\mathcal{G}$ is probabilistic.

- Signing algorithm $\mathcal{S}$. Given a message m and $(K_p, K_s)$,
$\mathcal{S}$ produces a signature $\sigma$. Algorithm $\mathcal{S}$ might be
probabilistic.

- Verification algorithm $\mathcal{V}$. Given a signature $\sigma$, a message m
and $K_p$, $\mathcal{V}$ tests whether $\sigma$ is a valid signature of m with
respect to $K_p$. In general, algorithm $\mathcal{V}$ is not probabilistic.

The signature scheme is $(t, \varepsilon)$-CMA-secure if there is no adversary
$\mathcal{A}$ which can get a probability greater than $\varepsilon$ in
mounting an existential forgery under an adaptive Chosen-Message Attack (CMA)
within time t. We denote this probability $\varepsilon$ as
$\mathsf{Succ}^{cma}_\Sigma(\mathcal{A})$.

3.4 Diffie-Hellman Problems

Computational Diffie-Hellman Assumption (CDH). Let $G = \langle g \rangle$ be a
cyclic group of prime order q and $x_1, x_2$ chosen at random in $Z_q$. A
$(T, \varepsilon)$-CDH-attacker in G is a probabilistic Turing machine $\Delta$
running in time T that, given $(g^{x_1}, g^{x_2})$, outputs $g^{x_1 x_2}$ with
probability at least $\varepsilon$. We denote this probability by
$\mathsf{Succ}^{cdh}_G(\Delta)$. The CDH problem is
$(T, \varepsilon)$-intractable if there is no $(T, \varepsilon)$-attacker in G.



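Group Computational Diffie-Hellman Assumption (G-CDH). Let
$G = \langle g \rangle$ be a cyclic group of prime order q and n a
polynomially-bounded integer. Let $I_n$ be $\{1, \ldots, n\}$,
$\mathcal{P}(I_n)$ be the set of all subsets of $I_n$ and $\Gamma$ be a subset
of $\mathcal{P}(I_n)$ such that $I_n \notin \Gamma$.

We define the Group Diffie-Hellman distribution relative to $\Gamma$ as:

$$\text{G-CDH}_\Gamma = \Big\{ \bigcup_{J \in \Gamma} \big(J, g^{\prod_{j \in J} x_j}\big) \;\Big|\; x = (x_1, \ldots, x_n) \in_R Z_q^n \Big\}.$$

If $\Gamma = \mathcal{P}(I_n) \setminus \{I_n\}$, we say that
$\text{G-CDH}_\Gamma$ is the Full Generalized Diffie-Hellman distribution
[9,23,30].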
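
To make the distribution concrete, here is a small Python sketch that samples a
G-CDH instance for a given $\Gamma$. The toy group (p = 1019, q = 509, g = 4)
and the helper names `full_gamma`, `gcdh_instance` and `sample_exponents` are
assumptions chosen for illustration; a real instantiation would use a
cryptographically sized group.

```python
import secrets
from itertools import combinations
from math import prod

# Toy parameters (assumption): G generates the subgroup of prime order Q in Z_P^*.
P, Q, G = 1019, 509, 4


def full_gamma(n):
    """All proper subsets of {1..n}: the Full Generalized DH index set."""
    items = range(1, n + 1)
    return [frozenset(c) for r in range(n) for c in combinations(items, r)]


def sample_exponents(n):
    """Draw x = (x_1, ..., x_n) uniformly from Z_Q."""
    return [secrets.randbelow(Q) for _ in range(n)]


def gcdh_instance(gamma, xs):
    """Return {J: g^(prod_{j in J} x_j)} for every J in gamma; xs[j-1] is x_j."""
    return {J: pow(G, prod(xs[j - 1] for j in J) % Q, P) for J in gamma}
```

For n = 3 the full distribution has seven entries (all subsets except
{1, 2, 3}); the value an attacker must output is the one indexed by the missing
set $I_n$.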

Given $\Gamma$, a $(T, \varepsilon)$-G-CDH$_\Gamma$-attacker in G is a
probabilistic Turing machine $\Delta$ running in time T that, given
$\text{G-CDH}_\Gamma$, outputs $g^{x_1 \cdots x_n}$ with probability at least
$\varepsilon$. We denote this probability by
$\mathsf{Succ}^{gcdh_\Gamma}_G(\Delta)$. The G-CDH$_\Gamma$ problem is
$(T, \varepsilon)$-intractable if there is no
$(T, \varepsilon)$-G-CDH$_\Gamma$-attacker in G.

Random Self-Reducibility of CDH and G-CDH. In a prime-order group G, the CDH
and G-CDH are random self-reducible problems [23]. Informally, this property
means that solving the problem on any original instance $\mathcal{D}$ can be
reduced to solving the problem on a random instance $\mathcal{D}'$. This
requires an efficient way to generate the random instances $\mathcal{D}'$ from
the original instance $\mathcal{D}$ and an efficient way to compute the
solution to the problem on $\mathcal{D}$ from the solution to the problem on
$\mathcal{D}'$.

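Certainly the most common is the additive random self-reducibility of the CDH
and G-CDH problems. We illustrate this property for the G-CDH problem. Given,
for example, an instance
$\mathcal{D} = (g^a, g^b, g^c, g^{ab}, g^{bc}, g^{ac})$ for any a, b, c, it is
possible to generate a random instance

$$\mathcal{D}' = \big(g^{(a+\alpha)}, g^{(b+\beta)}, g^{(c+\gamma)}, g^{(a+\alpha)(b+\beta)}, g^{(b+\beta)(c+\gamma)}, g^{(a+\alpha)(c+\gamma)}\big)$$

where $\alpha$, $\beta$ and $\gamma$ are random numbers in $Z_q$; however the
cost of such a computation may be high. And given the solution
$z = g^{(a+\alpha)(b+\beta)(c+\gamma)}$ to the instance $\mathcal{D}'$ it is
possible to recover the solution to the original instance $\mathcal{D}$, i.e.

$$g^{abc} = z \cdot (g^{ab})^{-\gamma} (g^{ac})^{-\beta} (g^{bc})^{-\alpha} (g^{a})^{-\beta\gamma} (g^{b})^{-\alpha\gamma} (g^{c})^{-\alpha\beta}\, g^{-\alpha\beta\gamma}.$$

It is, in effect, easy to see that such a reduction works only if $\mathcal{D}$
is the Full Generalized DH distribution and that its cost increases
exponentially with the size of $\mathcal{D}$.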
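
The additive reduction can be checked numerically. The sketch below is only an
illustration under assumed toy parameters (p = 1019, q = 509, g = 4) and
assumed helper names (`gpow`, `additive_randomize`, `additive_recover`). It
shows why every pairwise product of the original instance is needed: each
blinded element and the final correction both consume them.

```python
import secrets

# Toy parameters (assumption): G generates the subgroup of prime order Q in Z_P^*.
P, Q, G = 1019, 509, 4


def gpow(e):
    """g^e in the order-Q subgroup."""
    return pow(G, e % Q, P)


def additive_randomize(D):
    """Blind D = (g^a, g^b, g^c, g^ab, g^bc, g^ac) additively."""
    ga, gb, gc, gab, gbc, gac = D
    al, be, gm = (secrets.randbelow(Q) for _ in range(3))
    D_prime = (
        ga * gpow(al) % P,                                         # g^(a+al)
        gb * gpow(be) % P,                                         # g^(b+be)
        gc * gpow(gm) % P,                                         # g^(c+gm)
        gab * pow(ga, be, P) * pow(gb, al, P) * gpow(al * be) % P,  # g^((a+al)(b+be))
        gbc * pow(gb, gm, P) * pow(gc, be, P) * gpow(be * gm) % P,  # g^((b+be)(c+gm))
        gac * pow(ga, gm, P) * pow(gc, al, P) * gpow(al * gm) % P,  # g^((a+al)(c+gm))
    )
    return D_prime, (al, be, gm)


def additive_recover(z, D, blind):
    """Recover g^(abc) from z = g^((a+al)(b+be)(c+gm))."""
    ga, gb, gc, gab, gbc, gac = D
    al, be, gm = blind
    # Product of all seven cross terms in the expansion of (a+al)(b+be)(c+gm).
    correction = (
        pow(gab, gm, P) * pow(gac, be, P) * pow(gbc, al, P)
        * pow(ga, be * gm, P) * pow(gb, al * gm, P) * pow(gc, al * be, P)
        * gpow(al * be * gm)
    ) % P
    return z * pow(correction, -1, P) % P
```

With n exponents the expansion has $2^n - 1$ cross terms, which is the
exponential cost mentioned in the text.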

The other one is the multiplicative random self-reducibility of the CDH and
G-CDH problems. The property holds if G is a prime-order cyclic group. We
illustrate this property for the G-CDH problem. Given, for example, an instance
$\mathcal{D} = (g^a, g^b, g^{ab}, g^{ac})$ for any a, b, c, it is easy to
generate a random instance
$\mathcal{D}' = (g^{a\alpha}, g^{b\beta}, g^{ab\alpha\beta}, g^{ac\alpha\gamma})$
where $\alpha$, $\beta$ and $\gamma$ are random numbers in $Z_q^*$. And given
the solution $K$ to the instance $\mathcal{D}'$ it is easy to see that the
solution to the original instance $\mathcal{D}$ can be efficiently computed
(i.e. $g^{abc} = (g^{a\alpha b\beta c\gamma})^{(\alpha\beta\gamma)^{-1}}$).
Such a reduction is efficient and only requires a linear number of modular
exponentiations.
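
A minimal Python sketch of the multiplicative self-reduction follows. The toy
group (p = 1019, q = 509, g = 4) and the function names `mult_randomize` and
`mult_recover` are assumptions for illustration; the point is that blinding
costs one exponentiation per element and unblinding costs a single one.

```python
import secrets

# Toy parameters (assumption): G generates the subgroup of prime order Q in Z_P^*.
P, Q, G = 1019, 509, 4


def rand_unit():
    """Uniform element of Z_Q^* = {1, ..., Q-1}."""
    return secrets.randbelow(Q - 1) + 1


def mult_randomize(D):
    """Blind D = (g^a, g^b, g^ab, g^ac) with random alpha, beta, gamma."""
    ga, gb, gab, gac = D
    al, be, gm = rand_unit(), rand_unit(), rand_unit()
    D_prime = (
        pow(ga, al, P),        # g^(a*alpha)
        pow(gb, be, P),        # g^(b*beta)
        pow(gab, al * be, P),  # g^(ab*alpha*beta)
        pow(gac, al * gm, P),  # g^(ac*alpha*gamma)
    )
    return D_prime, (al, be, gm)


def mult_recover(K_prime, blind):
    """Map the solution of D' back to g^(abc) by inverting alpha*beta*gamma mod Q."""
    al, be, gm = blind
    inv = pow(al * be * gm, -1, Q)  # exists because Q is prime and no factor is 0
    return pow(K_prime, inv, P)
```

The modular inverse is taken in the exponent group $Z_q$, which is why the
property needs a prime-order group.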



Adversary's Resources. The security is formulated as a function of the amount
of resources the adversary $\mathcal{A}$ expends. The resources are:

- T, the time of computing;

- $q_s, q_r, q_c, Q_S, Q_R, Q_J$, the numbers of Send, Reveal, Corrupt, Setup,
Remove and Join queries the adversary $\mathcal{A}$ respectively makes.

By notation Adv(T, ...) or Succ(T, ...), we mean the maximum values of
Adv($\mathcal{A}$) or Succ($\mathcal{A}$) respectively, over all adversaries
$\mathcal{A}$ that expend at most the specified amount of resources.

4 A Secure Authenticated Group Diffie-Hellman Scheme

In the following theorem and proof we assume the random oracle model [6] and
denote $\mathcal{H}$ a hash function from $\{0,1\}^*$ to $\{0,1\}^\ell$, where
$\ell$ is a security parameter.

[Figure 1: message-flow diagram between $U_1$, $U_2$, $U_3$ and $U_4$.
Recoverable content: each player $U_i$ draws $x_i$ at random in $[1, q-1]$,
checks the incoming signed flow ($\mathcal{V}(Fl_{i-1}) = $ True) and sends the
signed flow $[Fl_i]_{U_i}$ built from $\mathcal{I}$ and the set $X_i$:

    $X_1 := \{g, g^{x_1}\}$
    $X_2 := \{g^{x_2}, g^{x_1}, g^{x_1 x_2}\}$
    $X_3 := \{g^{x_2 x_3}, g^{x_1 x_3}, g^{x_1 x_2}, g^{x_1 x_2 x_3}\}$
    $X_4 := \{g^{x_2 x_3 x_4}, g^{x_1 x_3 x_4}, g^{x_1 x_2 x_4}, g^{x_1 x_2 x_3}\}$

$U_4$ broadcasts $[Fl_4]_{U_4}$. Every player checks $\mathcal{V}(Fl_4) = $ True,
computes K (for instance $U_2$ computes $K := (g^{x_1 x_3 x_4})^{x_2}$) and sets
$sk := \mathcal{H}(\mathcal{I} \| Fl_4 \| K)$.]

Fig. 1. Algorithm SETUP1. An example of an honest execution with 4 players:
$\mathcal{J} = \{U_1, U_2, U_3, U_4\}$. The multicast group is
$\mathcal{I} = \{U_1, U_2, U_3, U_4\}$ and the shared session key is
$sk = \mathcal{H}(\mathcal{I} \| Fl_4 \| g^{x_1 x_2 x_3 x_4})$. The partner IDS
for $U_1$ is $pids_{U_1} = \{U_2, U_3, U_4\}$, for $U_2$ is
$pids_{U_2} = \{U_1, U_3, U_4\}$, for $U_3$ is $pids_{U_3} = \{U_1, U_2, U_4\}$
and for $U_4$ is $pids_{U_4} = \{U_1, U_2, U_3\}$.



The session-key space SK associated to this protocol is $\{0,1\}^\ell$ equipped
with a uniform distribution. The arithmetic is in a finite cyclic group
$G = \langle g \rangle$ whose order is a k-bit prime number q, and the operation
is denoted multiplicatively. This group could be a prime subgroup of $Z_p^*$,
or it could be an (hyper)-elliptic curve based group.



4.1 Description 

The AKE1 protocol consists of the SETUP1, REMOVE1 and JOIN1 algorithms.
As illustrated by an AKE1 execution in Figures 1, 2 and 3 (an execution with
more steps can be found in the full version [10]), this is a protocol wherein
the players are arranged in a ring, and wherein each player saves the set of
values it receives in the down-flow of SETUP1, REMOVE1 and JOIN1. In effect, in
the subsequent removal of players from $\mathcal{I}$, any player U could be
selected as $U_{gc}$ and so will need these values to execute REMOVE1.

Unlike [3], this is a protocol wherein the player with the highest index in
$\mathcal{I}$ is the group controller, the flows are signed using the
long-lived key $LL_U$, the names of the players are in the protocol flows, and
the session key SK is
$sk = \mathcal{H}(\mathcal{I} \| Fl_{\max(\mathcal{I})} \| g^{x_1 \cdots x_n})$,
where $Fl_{\max(\mathcal{I})}$ is the down-flow; SIDS and PIDS are
appropriately defined. The notion of index models “pre-existing” relationships
among players: for example, it may capture different levels of reliability
(i.e. the higher the index is, the more reliable the player). Also unlike [3],
this is a protocol where the set of values from the down-flow is included in
the flows of REMOVE1 and JOIN1, which avoids replay attacks.

[Figure 2: message-flow diagram of REMOVE1. Recoverable content: the previous
set of values is
$X_4 = \{g^{x_2 x_3 x_4}, g^{x_1 x_3 x_4}, g^{x_1 x_2 x_4}, g^{x_1 x_2 x_3}\}$;
$U_3$ draws a fresh value in $[1, q-1]$, broadcasts the signed flow
$[Fl_3]_{U_3}$, and $U_1$ and $U_3$ check $\mathcal{V}(Fl_3) = $ True, compute K
and set $sk := \mathcal{H}(\mathcal{I} \| Fl_3 \| K)$, where $h = g^{x_2 x_4}$.]

Fig. 2. Algorithm REMOVE1. An example of an honest execution with 4 players:
$\mathcal{I} = \{U_1, U_2, U_3, U_4\}$, $\mathcal{J} = \{U_2, U_4\}$. The new
multicast group is $\mathcal{I} = \{U_1, U_3\}$, $U_{gc} = U_3$ and the shared
session key is $sk = \mathcal{H}(\mathcal{I} \| Fl_3 \| K)$, with K the
Diffie-Hellman value computed by $U_1$ and $U_3$; the partner IDS for $U_1$ is
$pids_{U_1} = \{U_3\}$, for $U_3$ is $pids_{U_3} = \{U_1\}$.

[Figure 3: message-flow diagram of JOIN1. Recoverable content: the previous set
of values consists of two powers of $h = g^{x_2 x_4}$; $U_3$ draws a fresh value
in $[1, q-1]$ and sends the signed flow $[Fl_3]_{U_3}$ to $U_4$; $U_4$ checks
$\mathcal{V}(Fl_3) = $ True, draws its own value in $[1, q-1]$ and broadcasts
$[Fl_4]_{U_4}$; each player checks $\mathcal{V}(Fl_4) = $ True, computes K and
sets $sk := \mathcal{H}(\mathcal{I} \| Fl_4 \| K)$.]

Fig. 3. Algorithm JOIN1. An example of an honest execution with 4 players:
$\mathcal{I} = \{U_1, U_3\}$, $\mathcal{J} = \{U_4\}$ and $U_{gc} = U_3$. The
new multicast group is $\mathcal{I} = \{U_1, U_3, U_4\}$ and the shared session
key is $sk = \mathcal{H}(\mathcal{I} \| Fl_4 \| K)$, with K the Diffie-Hellman
value computed by the three players. The partner IDS for $U_1$ is
$pids_{U_1} = \{U_3, U_4\}$, for $U_3$ is $pids_{U_3} = \{U_1, U_4\}$ and for
$U_4$ is $pids_{U_4} = \{U_1, U_3\}$.



Algorithm SETUP1. The algorithm consists of two stages: up-flow and down-flow.
The multicast group $\mathcal{I}$ is set to $\mathcal{J}$. As illustrated by
the example in Figure 1, in the up-flow the player $U_i$ receives a set (Y, Z)
of intermediate values, with

$$Y = \bigcup_{0 < m < i} \{Z^{1/x_m}\} \quad \text{where} \quad Z = g^{\prod_{0 < t < i} x_t}.$$

Player $U_i$ chooses at random a private value $x_i$, raises the values in Y to
the power of $x_i$ and then concatenates with Z to obtain his intermediate
values

$$Y' = \bigcup_{0 < m \leq i} \{Z'^{1/x_m}\} \quad \text{where} \quad Z' = Z^{x_i} = g^{\prod_{0 < t \leq i} x_t}.$$

Player $U_i$ then forwards the values (Y', Z') to the next player in the ring.
The down-flow takes place when $U_{\max(\mathcal{I})}$ receives the last
up-flow. At that point $U_{\max(\mathcal{I})}$ performs the same steps as a
player in the up-flow but broadcasts the set of intermediate values Y' only. In
effect, the value Z' computed by $U_{\max(\mathcal{I})}$ will lead to the
session key sk, since $Z' = g^{\prod_{0 < t \leq n} x_t}$. Players in
$\mathcal{I}$ compute sk and accept.
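
The up-flow and down-flow can be simulated in a few lines. The following Python
sketch is an illustration only: it omits the signatures on the flows, uses an
assumed toy group (p = 1019, q = 509, g = 4), and the encoding fed to SHA-256
in `session_key` is an arbitrary choice standing in for the random oracle
$\mathcal{H}$. The names `upflow_step`, `setup1` and `session_key` are
hypothetical.

```python
import hashlib

# Toy parameters (assumption): G generates the subgroup of prime order Q in Z_P^*.
P, Q, G = 1019, 509, 4


def upflow_step(Y, Z, x):
    """One player's step: raise Y to x, append Z, and compute Z' = Z^x."""
    return [pow(y, x, P) for y in Y] + [Z], pow(Z, x, P)


def setup1(xs):
    """Run the up-flow over the secrets xs and return the final broadcast Y'."""
    Y, Z = [], G  # before U_1 acts, Z is g raised to the empty product
    for x in xs:
        Y, Z = upflow_step(Y, Z, x)
    return Y  # Y[m] = g^(product of every secret except xs[m]); Z is not broadcast


def session_key(group, broadcast, K):
    """Stand-in for H(I || Fl || K); the byte encoding is an assumption."""
    data = repr((sorted(group), broadcast, K)).encode()
    return hashlib.sha256(data).hexdigest()
```

Each player m recovers the shared value as `pow(Y[m], xs[m], P)`, which matches
the computation $K := (g^{x_1 x_3 x_4})^{x_2}$ shown for $U_2$ in Figure 1.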

Algorithm REMOVE1. This algorithm consists of a down-flow only. The multicast
group $\mathcal{I}$ is first set to $\mathcal{I} \setminus \mathcal{J}$. As
illustrated in Figure 2, the group controller $U_{gc}$ (i.e. the player with
the highest index in $\mathcal{I} \setminus \mathcal{J}$) generates a random
value $x'_{gc}$ and removes from the saved previous broadcast the values
destined to the players in $\mathcal{J}$. $U_{gc}$ then raises all the
remaining values in which $x_{gc}$ appeared to the power of
$(x_{gc}^{-1} \cdot x'_{gc})$ and broadcasts the result. ($x_{gc}$ is
$U_{gc}$'s previous secret value.) Players in $\mathcal{I}$ compute sk and
accept. Players in $\mathcal{J}$ erase any internal data. $U_{gc}$ erases
$x_{gc}$ and $x_{gc}^{-1}$ while internally saving $x'_{gc}$.

Algorithm JOIN1. This algorithm consists of two stages: up-flow and down-flow.
As illustrated in Figure 3, the group controller $U_{gc}$ (i.e. the player with
the highest index in $\mathcal{I}$) generates a random value $x'_{gc}$, raises
the values from the saved previous broadcast in which $x_{gc}$ appears to the
power of $(x_{gc}^{-1} \cdot x'_{gc})$ and obtains a set of values Y'.
($x_{gc}$ is $U_{gc}$'s previous secret exponent.) $U_{gc}$ also computes the
value Z' by raising the last value in Y' to $x'_{gc}$. $U_{gc}$ then forwards
the values (Y', Z') to the first joining player in $\mathcal{J}$. From that
point JOIN1 works as the SETUP1 algorithm. Upon receiving the broadcast flow,
players in $\mathcal{I} \cup \mathcal{J}$ erase previous session keys, compute
sk and accept. The multicast group $\mathcal{I}$ is then set to
$\mathcal{I} \cup \mathcal{J}$.
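
The two membership operations can be sketched in the same toy setting. This is
a hedged illustration, not the authors' code: it assumes the controller
refreshes its exponent by raising values to $x_{gc}^{-1} \cdot x'_{gc}$ modulo
q (the reading that matches Figures 2 and 3), it omits signatures, and the
names `setup_broadcast`, `remove1` and `join1` as well as the toy group are
assumptions.

```python
# Toy parameters (assumption): G generates the subgroup of prime order Q in Z_P^*.
P, Q, G = 1019, 509, 4


def setup_broadcast(xs):
    """Final SETUP1 broadcast: entry m is g^(product of all secrets except xs[m])."""
    Y, Z = [], G
    for x in xs:
        Y, Z = [pow(y, x, P) for y in Y] + [Z], pow(Z, x, P)
    return Y


def remove1(members, Y, leavers, x_gc, x_gc_new):
    """Drop the leavers' values and swap x_gc for x_gc_new in the others.

    members[k] is the index of the player whose value is Y[k].
    """
    remaining = [u for u in members if u not in leavers]
    gc = max(remaining)                        # highest index acts as controller
    refresh = pow(x_gc, -1, Q) * x_gc_new % Q  # x_gc^{-1} * x'_gc in the exponent
    new_Y = [y if u == gc else pow(y, refresh, P)  # gc's own value has no x_gc
             for u, y in zip(members, Y) if u not in leavers]
    return remaining, new_Y


def join1(members, Y, joiners, joiner_secrets, x_gc, x_gc_new):
    """Controller refreshes its exponent, then joiners continue the up-flow."""
    gc = max(members)
    refresh = pow(x_gc, -1, Q) * x_gc_new % Q
    Y = [y if u == gc else pow(y, refresh, P) for u, y in zip(members, Y)]
    Z = pow(Y[members.index(gc)], x_gc_new, P)  # Z' from the controller's own value
    for x in joiner_secrets:                    # same steps as the SETUP1 up-flow
        Y, Z = [pow(y, x, P) for y in Y] + [Z], pow(Z, x, P)
    return members + joiners, Y
```

Note that the exponents of removed players stay inside the key (the value
$h = g^{x_2 x_4}$ of Figure 2); secrecy against them rests on the controller's
fresh exponent.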

4.2 Security Result 

Theorem 1. Let P be the AKE1 protocol, SK be the session-key space and
$\mathcal{G}$ be the associated LL-key generator. Let $\mathcal{A}$ be an
adversary against the AKE security of P within a time bound T, on a multicast
group of size s among the n players in $\mathcal{U}$, after
$Q = Q_S + Q_J + Q_R$ interactions with the parties, $q_s$ send-queries and
$q_h$ hash-queries. Then we have:

$$\mathsf{Adv}^{ake}_P(T, Q, q_s, q_h) \leq 2Q \cdot \binom{n}{s} \cdot s \cdot q_h \cdot \mathsf{Succ}^{gcdh_{\Gamma_s}}_G(T') + 2n \cdot \mathsf{Succ}^{cma}_\Sigma(T', Q + q_s)$$

where $T' \leq T + (Q + q_s)\, n\, T_{exp}(k)$; $T_{exp}(k)$ is the time of
computation required for an exponentiation modulo a k-bit number and
$\Gamma_s$ corresponds to the elements the adversary $\mathcal{A}$ can possibly
view:

$$\Gamma_s = \bigcup_{2 \leq j \leq s-2} \big\{\{i \mid 1 \leq i \leq j, i \neq l\} \mid 1 \leq l \leq j\big\} \;\cup\; \big\{\{i \mid 1 \leq i \leq s, i \neq k, l\} \mid 1 \leq k, l \leq s\big\}.$$
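
The index set $\Gamma_s$ is easy to enumerate, which helps when checking it
against the extended trigon of Figure 4. The Python sketch below follows the
set expression above literally; the function name `gamma_s` is an assumption
made for this example.

```python
def gamma_s(s):
    """Enumerate the index sets in Gamma_s as frozensets."""
    # Basic trigon lines j = 2 .. s-2: the (j-1)-subsets of {1..j}.
    basic = {
        frozenset(i for i in range(1, j + 1) if i != l)
        for j in range(2, s - 1)
        for l in range(1, j + 1)
    }
    # Last two lines: remove one index (k == l) or two indices (k != l) from {1..s}.
    extended = {
        frozenset(i for i in range(1, s + 1) if i not in (k, l))
        for k in range(1, s + 1)
        for l in range(1, s + 1)
    }
    return basic | extended
```

For s = 4 this yields {1}, {2}, the six pairs and the four triples, and never
the full set {1, 2, 3, 4}, whose exponentiation is the value the attacker must
produce.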



Let us just highlight the main ideas. We consider an adversary $\mathcal{A}$
attacking the protocol P and then “breaking” the AKE security. $\mathcal{A}$
would have carried out her attack in one of two ways. (1) She may have gotten
her advantage by forging a signature with respect to some player's long-lived
public key. We will then use $\mathcal{A}$ to build a forger by “guessing” for
which player $\mathcal{A}$ will produce her forgery. (2) She may have broken
the scheme without altering the content of the flows. We will use her to solve
an instance of the G-CDH problem, by “guessing” the moment at which
$\mathcal{A}$ will make the Test-query and by injecting into the game the
elements from the instance of G-CDH received as input.

To work, (2) requires two things. We first “guess” the moment of the
Test-query, which means that we have to “guess” the number of operations that
will occur before the adversary makes the Test-query and the membership of the
multicast group when the adversary makes the Test-query. Second, based on this
guess we “embed” the instance of G-CDH into the protocol. We generate many
random instances from the original instance of G-CDH using the (multiplicative)
random self-reducibility property of the G-CDH problem*. Indeed, the group
Diffie-Hellman secret key relative to these random instances can efficiently be
computed from the group Diffie-Hellman secret relative to the original
instance.

The specific structure of $\Gamma_s$ (see Figure 4 for $\Gamma_4$) makes the
simulation perfectly indistinguishable from the adversary's point of view if
our guesses are all correct.

    j = 0
    j = 1              $\emptyset$
    j = 2              {1}  {2}
    j = 3 (= s - 1)    {1,2}  {1,3}  {2,3}  |  {1,4}  {2,4}  {3,4}
    j = 4 (= s)        {1,2,3}  {1,2,4}  {1,3,4}  {2,3,4}
                       (basic trigon | extension)

Fig. 4. Extended Trigon for $\Gamma_4$

But then, because of the random oracle $\mathcal{H}$, to have any information
about the session key the adversary wants to test, she has to have asked for
$\mathcal{H}(\mathcal{I} \| Fl_{last} \| K)$, where K is the value we are
looking for. Therefore, if the adversary has some advantage in breaking the AKE
security, this value K can be found in the list of the queries asked to
$\mathcal{H}$. The details of the simulation can be found in Appendix A.

* The multiplicative random self-reducibility will lead to a far more efficient
reduction than the additive one would do.

4.3 AKE1 in Practice

We want our results to be practical. This means that when system designers
choose a scheme they will take into account its security but also its
efficiency in terms of computation, communication, ease of integration and so
on. However, if provable security is achieved at the cost of a loss of
efficiency, system designers will often prefer the heuristic schemes.

AKE1 is to date the first group Diffie-Hellman scheme to exhibit a proof that
it achieves a strong notion of security. It is secure in the random oracle
model under the G-CDH assumption. It thus provides stronger security guarantees
than other schemes [3,12,18] while being more efficient than [3]. However,
security proofs for existing schemes or slight variants may show up.

On the integration front, the question that may be raised is what happens when
several groups merge to form a larger group. This scenario occurs in practice
when a network failure partitions the multicast group into several disjoint
sub-groups which later need to merge when the network is repaired [1]. The most
efficient way in terms of computation and communication is to add players from
the smaller sub-groups into the largest of the merging sub-groups. That is,
$U_{gc}$ is chosen as the player with the highest index in the largest merging
sub-group and the players from the smaller sub-groups are added via the JOIN1
algorithm.
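
The merge rule just described amounts to a small planning step. The sketch
below is a hypothetical helper (`plan_merge` is not part of the protocol
description): it picks the largest sub-group as the base, names its
highest-index player as controller, and lists the players to add through JOIN1.
Breaking ties between equally large sub-groups by input order is an assumption
of this example.

```python
def plan_merge(subgroups):
    """Return (base group, controller index, players to add via JOIN1)."""
    base = max(subgroups, key=len)  # largest sub-group; first one wins on ties
    joiners = sorted(u for grp in subgroups if grp is not base for u in grp)
    return base, max(base), joiners
```

For sub-groups {1, 5}, {2, 3, 4} and {6}, the base is {2, 3, 4}, the controller
is player 4, and players 1, 5 and 6 are added through JOIN1.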



5 Mutual Authentication 

The well-known approach [5] for turning an AKE protocol into a protocol that
provides mutual authentication (MA) is to use the shared session key to
construct a simple “authenticator” for the other parties. We have described in
[11] the transformation for turning an AKE group Diffie-Hellman scheme into a
protocol providing MA and justified its security in the random-oracle model. We
turn an AKE dynamic group Diffie-Hellman scheme into a protocol providing MA by
simply applying the transformation MA described in [11] to the setup, join and
remove algorithms respectively.



6 Conclusion and Further Research 

This paper provides the first formal treatment of the authenticated group
Diffie-Hellman key exchange problem in a scenario in which the membership is
dynamic rather than static. Addressed in this paper were two security goals of
the group Diffie-Hellman key exchange: the authenticated key exchange and the
mutual authentication. For each we presented a definition, a protocol and a
security proof in the random oracle model that the protocol meets its goals.

The model introduced in this paper captures attacks that are realistic threats
in practice. However the model does not yet capture “more serious” attacks:
indeed, it does not recognize multiple instances of a player that the adversary
may activate in concurrent and simultaneous sessions. A natural research topic
is to enhance our model to capture these attacks and to investigate in this
more stringent setting the security of the protocols presented in this paper.
We are currently extending our model to encompass these attacks.

The security reduction presented for AKE1 in this paper does not inject much of
the security of the group computational Diffie-Hellman problem and signature
scheme: actually, the reduction is exponential in s. This leads one to use a
larger security parameter or to limit the maximum size of the group. Another
research direction is to find a security proof that would achieve a better
security bound. We believe it is possible and are currently working on it.

Acknowledgements. The authors thank Deborah Agarwal and Jean-Jacques [...]
paper. The authors also thank the anonymous referees for their useful comments.



References 

1. D. A. Agarwal, O. Chevassut, M.R. Thompson, and G. Tsudik. An Integrated 
Solution for Secure Group Communication in Wide-Area Networks. In Proc. of 
6th IEEE Symposium on Computers and Communications, 2001. 

2. D. A. Agarwal, S. R. Sachs, and W. E. Johnston. The Reality of Collaboratories.
Computer Physics Communications, 10(1-3), pages 270-299, May 1998.

3. G. Ateniese, M. Steiner, and G. Tsudik. New Multiparty Authentication Services 
and Key Agreement Protocols. IEEE Journal of Selected Areas in Communica- 
tions, April 2000. 

4. K. Becker and U. Wille. Communication Complexity of Group Key Distribution.
In 5th ACM Conference on Computer and Communications Security, pages 1-6,
November 1998.

5. M. Bellare, D. Pointcheval, and P. Rogaway. Authenticated Key Exchange Secure
Against Dictionary Attacks. In B. Preneel, editor, Proc. of Eurocrypt ’00, volume
1807 of Lecture Notes in Computer Science, pages 139-155. Springer-Verlag, 2000.

6. M. Bellare and P. Rogaway. Entity Authentication and Key Distribution. In
D.R. Stinson, editor, Proc. of Crypto ’93, Lecture Notes in Computer Science.
Springer-Verlag, 1993.

7. M. Bellare and P. Rogaway. Random Oracles are Practical: a Paradigm for
Designing Efficient Protocols. In Proc. of ACM CCS ’93. ACM Press, 1993.

8. M. Bellare and P. Rogaway. The Exact Security of Digital Signatures: How to sign
with RSA and Rabin. In U. Maurer, editor, Proc. of Eurocrypt ’96, Lecture Notes
in Computer Science. Springer-Verlag, 1996.

9. D. Boneh. The Decision Diffie-Hellman Problem. In Third Algorithmic Number
Theory Symposium, volume 1423 of Lecture Notes in Computer Science, pages
48-63. Springer-Verlag, 1998.







10. E. Bresson, O. Chevassut, and D. Pointcheval. Provably Group Diffie-Hellman Key
Exchange - The Dynamic Case. Technical report, December 2001. Full version of
this paper, available at http://www.di.ens.fr/~pointche.

11. E. Bresson, O. Chevassut, D. Pointcheval, and J. J. Quisquater. Provably Group
Diffie-Hellman Key Exchange. In Proc. of 8th ACM Conference on Computer and
Communications Security, Nov 2001.

12. M. Burmester and Y. Desmedt. A Secure and Efficient Conference Key Distribution
System. In A. De Santis, editor, Proc. of Eurocrypt ’94, volume 950 of Lecture Notes
in Computer Science, pages 275-286. Springer-Verlag, 1995.

13. R. Canetti, J. Garay, G. Itkis, D. Micciancio, M. Naor, and B. Pinkas. Issues in
Multicast Security: A Taxonomy and Efficient Constructions. In Proc. of
INFOCOM ’99, March 1999.

14. R. Canetti, O. Goldreich, and S. Halevi. The Random Oracle Methodology,
Revisited. In Proc. of Symposium on the Theory of Computing (STOC). ACM, March
1998.

15. E. Fujisaki, T. Okamoto, D. Pointcheval, and J. Stern. RSA-OAEP is Secure under
the RSA Assumption. In Proc. of Crypto ’01, August 2001.

16. I. Ingemarsson, D. Tang, and C. Wong. A Conference Key Distribution System. In 
IEEE Transactions on Information Theory, volume 28(5), pages 714-720, Septem- 
ber 1982. 

17. M. Jakobsson and D. Pointcheval. Mutual Authentication for Low-Power Mobile 
Devices. In Proc. of Financial Cryptography ’2001, 2001. 

18. M. Just and S. Vaudenay. Authenticated Multi-Party Key Agreement. In Proc. of
ASIACRYPT ’96, volume 1163 of Lecture Notes in Computer Science, pages 36-49.
Springer-Verlag, 1996.

19. Y. Kim, A. Perrig, and G. Tsudik. Simple and Fault-Tolerant Key Agreement 
for Dynamic Collaborative Group. In Proc. of ACM Conference on Computer and 
Communications Security (CCS-7), November 2000. 

20. Y. Kim, A. Perrig, and G. Tsudik. Communication-Efficient Group Key Agree- 
ment. In Proc. of International Federation for Information Processing (IFIP SEC 
2001), June 2001. 

21. S. McCanne and V. Jacobson. vic: A Flexible Framework for Packet Video. In
ACM Multimedia ’95, pages 511-522, November 1995.

22. L.E. Moser, P.M. Melliar-Smith, and P. Narasimhan. Consistent Object Replica- 
tion in the Eternal System. Theory and Practice of Object Systems, 4(2):pages 
81-92, 1998. 

23. M. Naor and O. Reingold. Number-Theoretic Constructions of Efficient Pseudo- 
Random Functions. In Proc. of 38th IEEE FOCS Symposium, pages 458-467, 
1997. 

24. O. Pereira and J. J. Quisquater. A Security Analysis of the Cliques Protocols
Suites. In 14th IEEE Computer Security Foundations Workshop. IEEE Computer
Society Press, June 2001.

25. D. Pointcheval. Secure Designs for Public-Key Cryptography based on the Discrete 
Logarithm. To appear in Discrete Applied Mathematics, Elsevier Science, 2001. 

26. D. Pointcheval and J. Stern. Security Arguments for Digital Signatures and Blind 
Signatures. J. of Cryptology, 13(3):361-396, 2000. 

27. V. Shoup. Lower Bounds for Discrete Logarithms and Related Problems. In
W. Fumy, editor, Proc. of Eurocrypt ’97, volume 1233 of Lecture Notes in Computer
Science, pages 256-266. Springer-Verlag, 1997.







28. D. Steer, L. Strawczynski, W. Diffie, and M. Wiener. A Secure Audio
Teleconference System. In S. Goldwasser, editor, Proc. of Crypto ’88, volume 403
of Lecture Notes in Computer Science, pages 520-528. Springer-Verlag, 1988.

29. M. Steiner, G. Tsudik, and M. Waidner. Key Agreement in Dynamic Peer Groups.
In IEEE Transactions on Parallel and Distributed Systems, August 2000.

30. M. Steiner, G. Tsudik, and M. Waidner. Diffie-Hellman Key Distribution Extended
to Groups. In ACM CCS ’96, March 1996.

31. Wen-Guey Tzeng. A Practical and Secure Fault-Tolerant Conference-Key Agreement
Protocol. In Proc. of PKC 2000, Lecture Notes in Computer Science.
Springer-Verlag, 2000.

A Proof of Theorem 1 

Let $\mathcal{A}$ be an adversary that can get an advantage $\varepsilon$ in
breaking the AKE security of protocol P within time T. We construct from it a
$(T'', \varepsilon'')$-forger $\mathcal{F}$ and a
$(T', \varepsilon')$-G-CDH$_{\Gamma_s}$-attacker $\Delta$.

Forger $\mathcal{F}$. Let's assume that $\mathcal{A}$ breaks the protocol P by
forging, with probability greater than $\nu$, a signature with respect to some
player's (public) LL-key (of course before $\mathcal{A}$ corrupts U). We
construct from it a $(T'', \varepsilon'')$-forger $\mathcal{F}$ which outputs a
forgery $(\sigma, m)$ with respect to a given (public) LL-key $K_p$, produced
by $\mathcal{G}(1^k)$. This forger works exactly as in [11]. A detailed
description can be found in the full version [10].

G-CDH$_{\Gamma_s}$-attacker $\Delta$. Let's assume that $\mathcal{A}$ breaks
the protocol P without producing a forgery. Here, with probability smaller than
$\nu$, the (valid) flows signed using $LL_U$ come from player U and not from
$\mathcal{A}$ (of course before $\mathcal{A}$ corrupts U). The replay attacks
involving the flows of JOIN1 and REMOVE1 need not be considered either, since
the values from the previous broadcast are included in these flows. One may
then worry about replay attacks against SETUP1; however, SETUP1 has already
been proved to be secure for concurrent executions by Bresson et al. [11].

We now construct from $\mathcal{A}$ a
$(T', \varepsilon')$-G-CDH$_{\Gamma_s}$-attacker $\Delta$ that receives as
input an instance $\mathcal{D}$ of G-CDH$_{\Gamma_s}$ with random size s and
outputs the Diffie-Hellman secret value (i.e. $g^{x_1 \cdots x_s}$) relative to
this instance. More precisely, a G-CDH$_{\Gamma_s}$ with size $s \in [1, n]$
and $\Gamma_s$ of the form

$$\Gamma_s = \bigcup_{2 \leq j \leq s-2} \big\{\{i \mid 1 \leq i \leq j, i \neq l\} \mid 1 \leq l \leq j\big\} \;\cup\; \big\{\{i \mid 1 \leq i \leq s, i \neq k, l\} \mid 1 \leq k, l \leq s\big\}.$$

This in turn leads to an instance
$\mathcal{D} = (S_1, S_2, \ldots, S_{s-2}, S_{s-1}, S_s)$ wherein: $S_j$, for
$2 \leq j \leq s-2$ and $j = s$, is the set of all the $(j-1)$-tuples one can
build from $\{1, \ldots, j\}$; but $S_{s-1}$ is the set of all $(s-2)$-tuples
one can build from $\{1, \ldots, s\}$.

The aim of the simulation is to have all the elements of $S_s$ embedded into
the protocol when the adversary $\mathcal{A}$ asks the Test-query. In this
case, $\mathcal{A}$ will not be able to get any information about the value sk
of the session key without having previously queried the random hash oracle
$\mathcal{H}$ on the Diffie-Hellman secret value $g^{x_1 \cdots x_s}$. Thus, to
break the security of P the adversary $\mathcal{A}$ would have to have asked a
query of the form $\mathcal{H}(\mathcal{I}, Fl_{last}, g^{x_1 \cdots x_s})$,
which as a consequence will be in the list of queries asked to $\mathcal{H}$.

To reach this aim $\Delta$ has to guess several values: $c_0$, $\mathcal{I}_0$
and $i_0$. We now describe what these values are used for and we will return to
the formal simulation later on.

$\Delta$ first picks at random in [1, Q] the number of operations $c_0$ that
will occur before $\mathcal{A}$ asks the Test-query and embeds the elements of
$S_s$ into the operation that will occur at $c_0$. However $\Delta$ cannot
embed all the elements of $S_s$ at $c_0$ since, contrary to SETUP1, in JOIN1
and REMOVE1 the players are not all added to the group at $c_0$. $\Delta$
rather embeds the elements from $S_1$ to $S_s$ in the order the players are
added to the group*, but only for the players that will belong to the group at
$c_0$. Thus, $\Delta$ also chooses at random s index-values $u_1$ through $u_s$
in [1, n] that it hopes will make up the group membership at $c_0$.

$\Delta$ also needs to cope with protocol executions wherein the players $U_i$,
$1 \leq i \leq s$, are repeatedly added to and removed from the group so that,
several times before reaching $c_0$, the group membership is $\mathcal{I}_0$.
In effect, if $\Delta$ embeds all the elements of $S_s$ into the protocol
execution the first time the group membership is $\mathcal{I}_0$, $\Delta$ is
able to compute neither the Diffie-Hellman secret value involved nor the
session key value sk needed to answer the Reveal-query.

To be able to answer, $\Delta$ does not in fact embed $S_s$ into the broadcast
flow of the operation which updates the group membership to be $\mathcal{I}_0$
but embeds truly random values. $\Delta$ guesses the player $U_{i_0}$ from
$\mathcal{I}_0$ who will embed $S_s$ into the broadcast flow of the operation
that occurs at $c_0$**, but generates a truly random exponent and uses it to
embed truly random values for the operations that occur before $c_0$ and after
$c_0$. The index $i_0$ is set as follows. If the $c_0$-th operation is JOIN1
then $i_0$ is the last joining player's index, otherwise $i_0$ is the group
controller's index $\max(\mathcal{I}_0)$.

We now show that the above simulation and the random self-reducibility of G-CDH
allow $\Delta$ to answer all the queries until $\mathcal{A}$ asks the
Test-query at $c_0$. Since $\Delta$ embeds elements of $S_i$ when a player
$U_i$ from $\mathcal{I}_0$ (except $U_{i_0}$) is added to the group and
$\Delta$ does not remove it when $U_i$ leaves, each protocol flow consists of a
random self-reduction on one line (line 0, i.e. $S_0$, down to line $s-1$, i.e.
$S_{s-1}$) of the basic trigon. The trigon is illustrated in Figure 4. Thus,
$\Delta$ can derive the value sk of the session key from one of the values in
the line below (line 1, i.e. $S_1$, up to line s, i.e. $S_s$).

However, $\Delta$ also needs to be able to answer all queries after $c_0$ and
more specifically the Reveal-queries. To this aim, $\Delta$ has to un-embed the
element $S_s$ from the protocol and do it in the operation that occurs at
$c_0 + 1$. However, depending on the operation that occurs at $c_0 + 1$,
$\Delta$ may not be able to do it for player $U_{i_0}$. This is the reason why
the line $S_{s-1}$ has to contain all the possible $(s-2)$-tuples: the
extension of the basic trigon illustrated in Figure 4. For the operations that
will occur after $c_0 + 1$, $\Delta$ uses truly random exponents for all the
players including those in $\mathcal{I}_0$. Thus, after $c_0 + 1$ all the
protocol flows involve elements in $S_{s-1}$ and $S_s$ only.

This brief description completes the proof. The full behavior of the simulator
is given in Figure 5, with an example in Figure 6. The probability analyses can
be found in the full paper [10].

* More precisely, $\Delta$ keeps in some variable T the order of arrival for
the players in $\mathcal{I}_0$, in order to know which elements of the trigon
have to be used for each player. The variable T is reset whenever a Setup
occurs.

** $\Delta$ may also embed a self-reduced element generated from $S_s$ into the
broadcast flow.



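Setup($\mathcal{J}$)
    Reset T to $\emptyset$
    Increment c
    Update $\mathcal{I} \leftarrow \mathcal{J}$
    $u \leftarrow \min(\mathcal{J})$
    • $c < c_0$ : $u \in \mathcal{I}_0$, $u \neq i_0$ $\Rightarrow$ simulate using RSR according to T
    • $c = c_0$ : $\mathcal{J} \neq \mathcal{I}_0$ $\Rightarrow$ output “Fail”
                  $\mathcal{J} = \mathcal{I}_0$, $u = i_0$ $\Rightarrow$ simulate using RSR according to T
    Else proceed as in P using $r_u \in_R Z_q^*$

Join($\mathcal{J}$)
    Increment c
    $u \leftarrow \max(\mathcal{I})$
    Update $\mathcal{I} \leftarrow \mathcal{I} \cup \mathcal{J}$
    • $c < c_0$ : $u \in \mathcal{I}_0$, $u \neq i_0$ $\Rightarrow$ simulate using RSR according to T
    • $c = c_0$ : $\mathcal{I} \neq \mathcal{I}_0 \vee \max(\mathcal{I}) \neq i_0$ $\Rightarrow$ output “Fail”
                  $\mathcal{I} = \mathcal{I}_0$ $\Rightarrow$ simulate using RSR according to T
    Else proceed as in P using $r_u \in_R Z_q^*$

Remove($\mathcal{J}$)
    Increment c
    Update $\mathcal{I} \leftarrow \mathcal{I} \setminus \mathcal{J}$
    $u \leftarrow \max(\mathcal{I})$
    • $c < c_0$ : $u \in \mathcal{I}_0$, $u \neq i_0$ $\Rightarrow$ simulate using RSR according to T
    • $c = c_0$ : $\mathcal{I} \neq \mathcal{I}_0$ $\Rightarrow$ output “Fail”
                  $\mathcal{I} = \mathcal{I}_0$ $\Rightarrow$ simulate using RSR according to T
    Else proceed as in P using $r_u \in_R Z_q^*$

Send($U_i$, m)
    • $c < c_0$ : $i \in \mathcal{I}_0$, $i \neq i_0$ $\Rightarrow$ simulate using RSR according to T
    • $c = c_0$ : $i \in \mathcal{I}_0$ $\Rightarrow$ simulate using RSR according to T
    Else proceed as in P using $r_i \in_R Z_q^*$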

Fig. 5. Game“*^®(.4, P). The multicast group is T. The Test-query is “guessed” to be 
made; after c0 operations, the multicast group is T0, and the last joining player is U_{i0}. 
In the variable T, A stores which exponents of the instance P have been injected in the 
game so far. RSR stands for random self-reducibility. 

[Figure 6: diagram of the example execution. Sequence of operations shown: 
  Setup{U1, U2, U3}: c = 0, T = {1, 2, 3}; SK is known to A 
  Remove{U2}: c = 1, T = {1, 3}; SK is known to A 
  Join{U4}: c = 2, T = {1, 3, 4}; SK is known to A 
  Join{U2}: c = 3, T = {1, 2, 3, 4}; SK is the DH secret; Test-query guessed now; 
    broadcast sent by U2 
  Remove{U1, U2}: c = 4, T = {3, 4}; SK is known to A again 
  Join{U2}: c = 5, T = {2, 3, 4}] 




Fig. 6. An example of an execution of the protocol P=AKE1 with the adversary. We 
represent the simulation by A according to the following “guesses”: co = 3, s = 4,lTo = 
{1, 2, 3, 4}, io = 2. We denote by b, b' , b" etc. some blinding exponents used in the self- 
reduction of G-CDH (think b” as being 6/3", e.g.). Also note that when rejoining the 
group at steps c = 3 and c = 5, U2 does not “remove” its random exponent. 

Abstract. The aim of this article is to propose a fully distributed 
environment for the RSA scheme. We have in mind highly sensitive 
applications: even if we are ready to pay a price in terms of 
efficiency, we do not want any compromise on the security assumptions 
that we make. Recently, Shoup proposed a practical RSA threshold 
signature scheme that allows a set of players to share the ability to 
sign. This scheme can be used for decryption as well. However, 
Shoup’s protocol assumes a trusted dealer to generate and distribute 
the keys. This comes from the fact that the scheme needs a special 
assumption on the RSA modulus, and RSA moduli of this kind cannot 
easily be generated efficiently by many players. Of course, it is 
still possible to invoke theoretical results on multiparty computation, 
but we cannot hope to obtain efficient protocols in this way. The only 
practical protocol for generating RSA moduli in a distributed manner is 
Boneh and Franklin’s, but it seems difficult to modify it in order to 
generate the kind of RSA moduli that Shoup’s protocol requires. 

The present work takes a different path by proposing a method to 
enhance the key generation with some additional properties and revisits 
Shoup’s protocol to work with the resulting RSA moduli. Both of these 
enhancements decrease the performance of the basic protocols. However, 
we think that in the applications we target, these enhancements provide 
practical solutions. Indeed, the key generation protocol is usually run 
only once and the number of players used to sign or decrypt is not very 
large. Moreover, these players have time to perform their task, so that 
the communication and time complexities are not overly important. 

Keywords: Threshold RSA key generation and signature 



1 Introduction 

The RSA cryptosystem [34] is widely used in today's practical systems. For 
instance, many PKI products are based on it. In such systems, the root key 
requires strong protection. Therefore, threshold protocols can be used to 
share the signature capability among a subset of people rather than to give 
the power of signing to only one person. Moreover, protocols of this kind 
can withstand stronger adversaries than “centralized” cryptosystems. 

C. Boyd (Ed.): ASIACRYPT 2001, LNCS 2248, pp. 310-330, 2001. 

(c) Springer-Verlag Berlin Heidelberg 2001 

Indeed, threshold cryptography can cope with break-in adversaries that have 
the ability to corrupt people and read the memory of servers [6]. These 
adversaries are stronger than “normal” adversaries that can only read exchanged 
messages. In a “centralized” cryptosystem, if a break-in adversary reads 
the memory, he learns the whole key and the system is compromised. As attacks 
of this kind, carried out by intruders (hackers, Trojan horses) or by corrupted 
insiders, are very common and frequently easy to perform, systems must be 
protected against them. In threshold cryptography, the secret key is split into 
shares and each share is given to one of a group of servers. However, in order 
to be sure that the key is never held entirely on one machine, one can also 
distribute the key generation phase. Consequently, we say that a cryptosystem 
is fully distributed if it is distributed from the key generation to the 
signature or decryption phase. 

In the case of discrete-log based cryptosystems, known solutions exist to 
distribute DSS, ElGamal and Cramer-Shoup [20,38,7]. Moreover, a protocol to 
distribute a discrete-log key was first proposed by Pedersen in [27]. This 
protocol was later revisited to fix a security flaw [21,15]. Therefore, 
discrete-log cryptosystems can be fully distributed. However, a fully 
distributed version of RSA is a more challenging and important task. 

In this paper we propose new techniques to fully distribute RSA. This solves 
an open problem where one needs to cope with requirements that do not match. 
On one hand, at Eurocrypt ’00, Shoup described a practical threshold signature 
scheme [37] where the primes of the RSA modulus should be safe. On the other 
hand, Boneh and Franklin at Crypto ’97 [4] described a protocol to share the key 
generation of an RSA modulus. However, the generation of safe moduli seems to 
be hard with this protocol. The present work takes a different path by proposing 
a method to enhance the key generation with some additional properties and 
revisits Shoup’s protocol to work with the resulting RSA moduli. 

1.1 Why Is It Important to Share Shoup’s Threshold RSA? 

Shoup's threshold RSA signature scheme [37] presents interesting features. First 
of all, it is secure and robust in the random oracle model assuming the RSA 
problem is hard. Next, the signature share generation and verification are 
completely non-interactive and finally, the size of an individual signature 
share is bounded by a constant times the size of the RSA modulus. However, this 
scheme requires a trusted dealer to generate the keys and distribute the shares 
of the secret key among ℓ servers. 

When a message m has to be signed by a quorum of at least t + 1 servers, 
where 2t + 1 ≤ ℓ, a special server, called the combiner, forwards the message 
m or x = H(m) to all servers. Then, each server computes its signature share 
along with a proof of correctness. Finally, the combiner selects a subgroup of 
t + 1 servers by checking the proofs and combines the t + 1 related signature 
shares to generate the signature s. 

Efficient communication model against active adversary. The main 
characteristic of Shoup’s protocol in relation to previous proposals [17,16,32] 
is the following. In the discrete-log case, it is easy to compute inverses mod q, 
where q denotes the order of the group G generated by g, because q is public. 
With RSA, we cannot disclose inverses of a known value mod $\varphi(N)$ without 
revealing the factorization of N unless we use a special algebraic structure, 
called a module, as in [35,19]. Computations in such a structure can be done 
efficiently using [25]. If we do not want to use a module, we face the problem 
of computing inverses when we use polynomial sharing, in order to compute the 
Lagrange coefficients. Consequently, some authors [17,32] have proposed additive 
sharings to avoid this calculation. As they then need all shares to generate the 
signature, they devise strategies to cope with corrupted or crashed servers. 
Different strategies can be used to let non-corrupted servers reconstruct the 
lost shares: either using two different sharings (additive and polynomial) as in 
[32], or using probabilistic assignments as in [17]. With additive sharing, 
$d = \sum_{i=1}^{\ell} d_i \bmod \varphi(N)$, the combiner easily computes the 
signature s from the $\ell$ correct signature shares $s_i = x^{d_i} \bmod N$, 
where x = H(m) is the message to be signed, by using the formula: 

$$ s = \prod_{i=1}^{\ell} s_i = x^{\sum_{i=1}^{\ell} d_i} = x^{d} \bmod N \qquad (1) $$ 

The main drawbacks of these techniques are the size of the key shares, because 
different sharings are needed, and the use of protocols to reconstruct the bad 
signature shares in the presence of active (malicious) players. 
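
The additive combination in formula (1) can be illustrated with a short sketch. This is only a toy illustration, not the construction of [17,32]: the primes, the exponent e, the stand-in value for x = H(m) and the helper names `additive_shares` and `combine` are all assumptions chosen for readability, and a single party plays the dealer here.

```python
import secrets

# Toy parameters (assumption: far too small for real use).
p, q = 1019, 1187
N = p * q
phi = (p - 1) * (q - 1)
e = 17
d = pow(e, -1, phi)  # private exponent, exists because gcd(e, phi) == 1


def additive_shares(secret, modulus, n_servers):
    """Split `secret` into n_servers values that sum to it modulo `modulus`."""
    shares = [secrets.randbelow(modulus) for _ in range(n_servers - 1)]
    shares.append((secret - sum(shares)) % modulus)
    return shares


def combine(sig_shares, modulus):
    """Multiply all signature shares; every single share is required."""
    s = 1
    for s_i in sig_shares:
        s = (s * s_i) % modulus
    return s


x = 123456                                         # stands in for x = H(m)
d_shares = additive_shares(d, phi, 5)              # d = d_1 + ... + d_5 mod phi(N)
sig_shares = [pow(x, d_i, N) for d_i in d_shares]  # s_i = x^{d_i} mod N
s = combine(sig_shares, N)                         # s = x^d mod N
```

Dropping any one entry of `sig_shares` makes the product wrong, which is why additive schemes need a separate mechanism to recover the shares of crashed or corrupted servers.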

In [16], the authors proposed the first proven scheme based on polynomial 
sharing, building on Desmedt and Frankel’s scheme [13]. However, in the 
case of active adversaries, which are allowed to send bad shares, the protocol 
has to be rewound at most t times to remove the bad servers, as the signature 
shares depend on the subgroup of t + 1 servers enabling the reconstruction of the 
signature. Let $\Delta = \ell!$. The shares of d are such that $\Delta \mid d_i$ 
and $d_i = f(i)$, where f is a polynomial of degree t with constant term d. If we 
denote by S the subgroup of t + 1 servers, let the Lagrange coefficients be 

$$ \lambda^{S}_{0,j} = \prod_{j' \in S \setminus \{j\}} \frac{0 - j'}{j - j'} $$ 

Therefore, $d = \sum_{i \in S} \lambda^{S}_{0,i} d_i \bmod \varphi(N)$ from the 
Lagrange formula. There are two problems. First of all, $\lambda^{S}_{0,j}$ cannot 
be computed in $\mathbb{Z}_{\varphi(N)}$ since $(j - j')$ could be even and not 
invertible mod $\varphi(N)$. Next, the combiner has to compute 

$$ s = \prod_{i \in S} s_i^{\lambda^{S}_{0,i}} = x^{\sum_{i \in S} \lambda^{S}_{0,i} d_i} = x^{d} \bmod N \qquad (2) $$ 

But, as the $\lambda^{S}_{0,j}$’s are not integers and the combiner cannot compute 
roots modulo a composite number (otherwise it could solve the RSA problem), he 
cannot compute equation (2). The key idea is to note that the 
$\Delta \times \lambda^{S}_{0,j}$ are integers. Therefore, if we write 
$d_i = \Delta \times d'_i$, then, for a group S of t + 1 servers, each one can 
compute $\mu^{S}_{0,i} = \Delta \times \lambda^{S}_{0,i} \in \mathbb{Z}$ and 
$s'_i = x^{d'_i \mu^{S}_{0,i}} \bmod N$. Finally, the combiner computes 

$$ s = \prod_{i \in S} s'_i = \prod_{i \in S} x^{d'_i \mu^{S}_{0,i}} = x^{\sum_{i \in S} \Delta d'_i \lambda^{S}_{0,i}} = x^{\sum_{i \in S} d_i \lambda^{S}_{0,i}} = x^{d} \bmod N \qquad (3) $$ 

If the signature is not valid, the combiner removes the bad servers thanks to a 
proof of robustness, defines another group S and rewinds the protocol. It is then 
clear that after at most t trials, all bad servers are removed from the set S and 
the signature will be correct. However, this redefinition of the subgroup is not 
very satisfactory, and Shoup and others have proposed a new trick to avoid this 
problem. 

Shoup in [37] and Miyazaki, Sakurai and Yung in [26] solve this problem 
by using a well-known lemma to extract an e-th root of w modulo a composite 
number from an e-th root of a known power of w [24], without any secret. The 
solution is to multiply the Lagrange coefficients by $\Delta$ so that they are 
integers: $\mu^{S}_{0,j} = \Delta \times \lambda^{S}_{0,j} \in \mathbb{Z}$ and 
$\Delta d = \sum_{j \in S} \mu^{S}_{0,j} d_j \bmod \varphi(N)$, if we denote by S 
a subset of t + 1 elements. Therefore, if we let $s_j = x^{d_j} \bmod N$, the 
combiner can compute signatures and change groups (computing new Lagrange 
coefficients according to the group of t + 1 servers) without asking the servers 
for new signature shares. He computes equation (2) using $\mu^{S}_{0,j}$ and gets 
$x^{\Delta d} \bmod N$. Finally, the combiner can compute an e-th root of 
$x^{\Delta}$ with the previous formula and can recover an e-th root of x using 
the well-known lemma. Consequently, if we use Shoup’s scheme, there is no need 
to generate $d_i$ such that $\Delta \mid d_i$ as is done in [18,12]. 
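
The two ingredients just described, integer Lagrange coefficients and the root-extraction lemma, can be sketched as follows. This is a simplified illustration under stated assumptions, not Shoup's exact scheme: the primes and e are toy values, the shares are $s_j = x^{d_j}$ without the extra squarings and proofs of [37], the dealer is simulated in the clear, and the names `share`, `integer_lagrange` and `combine` are hypothetical.

```python
import secrets
from fractions import Fraction
from math import factorial

# Toy parameters (assumption: illustrative sizes only).
p, q = 1019, 1187
N, phi = p * q, (p - 1) * (q - 1)
ell, t = 5, 2
delta = factorial(ell)  # Delta = ell!
e = 7                   # prime larger than ell, hence gcd(e, Delta) = 1
d = pow(e, -1, phi)


def share(secret, t, ell, modulus):
    """Shamir-style sharing: d_i = f(i) mod modulus, with f(0) = secret."""
    coeffs = [secret] + [secrets.randbelow(modulus) for _ in range(t)]
    return {i: sum(c * i**k for k, c in enumerate(coeffs)) % modulus
            for i in range(1, ell + 1)}


def integer_lagrange(S, j, delta):
    """Delta * lambda^S_{0,j}; an integer because Delta = ell!."""
    lam = Fraction(delta)
    for jp in S:
        if jp != j:
            lam *= Fraction(-jp, j - jp)
    assert lam.denominator == 1
    return int(lam)


def combine(x, sig_shares, S):
    """Recover an e-th root of x from the t + 1 shares indexed by S."""
    w = 1
    for j in S:
        mu = integer_lagrange(S, j, delta)
        # A negative mu uses a modular inverse (Python 3.8+).
        w = (w * pow(sig_shares[j], mu, N)) % N
    # Now w^e = x^Delta. With a*Delta + b*e = 1, the value w^a * x^b
    # is an e-th root of x, and no secret is needed to compute it.
    a = pow(delta, -1, e)
    b = (1 - a * delta) // e
    return (pow(w, a, N) * pow(x, b, N)) % N


x = 424242                                    # stands in for x = H(m)
d_shares = share(d, t, ell, phi)
sig_shares = {i: pow(x, d_i, N) for i, d_i in d_shares.items()}
s = combine(x, sig_shares, [1, 3, 5])
```

The same `sig_shares` can be recombined with any other subset of t + 1 indices, for instance `combine(x, sig_shares, [2, 4, 5])`, which is the point of the trick: changing the group does not require new shares from the servers.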

Even if the protocol of Frankel et al. in [16] provides a fully distributed 
version of RSA, it is less elegant than Shoup’s, and it would be desirable to 
distribute the latter. Moreover, Shoup's scheme offers other valuable 
improvements, such as the proof of robustness. This proof uses safe primes, like 
that of Gennaro et al. [20], and avoids the drawbacks of a special relation 
between prover and verifier. Furthermore, in [16], the authors describe an 
interactive protocol, which is less efficient than Shoup’s, which uses 
non-interactive zero-knowledge proofs. Therefore, we face the problem of 
distributing this non-interactive proof. 



Proof of robustness and use of safe primes. As we said before, a second key 
point in Shoup’s signature scheme is the proof of correctness, which guarantees 
the robustness of the scheme. Robustness means that corrupted servers should 
not be able to prevent uncorrupted servers from signing. This property is 
attractive for threshold protocols in the presence of active adversaries that can 
modify the behavior of servers. In Shoup’s scheme, the proof of correctness 
requires an RSA modulus built with safe primes. In the proof of correctness, 
servers must prove that they raise x to the correct power, namely $d_i$, their 
share of the secret. To this end, each server i has a verification key 
$v_i = v^{d_i} \bmod N$ and proves that 
$\log_v v_i = \log_x s_i \; (= d_i \bmod \varphi(N))$. The problems are: 
$\mathbb{Z}_N^*$ is not a cyclic group and its order is unknown, so such a 
generator v does not exist, and elements of maximal order cannot easily be found. 
However, Shoup noted that if we use RSA moduli with safe primes, then the group 
of squares in $\mathbb{Z}_N^*$ is cyclic and it is easy to find generators. 
Consequently, the proof of correctness can be made non-interactive and proved 
correct without further assumptions. Finally, safe prime moduli are also used in 
the key generation protocol in order to guarantee the secrecy of Shamir’s secret 
sharing. 
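
Shoup's observation can be checked by brute force on a toy modulus. The sketch below is only an illustration with assumed small primes: for the safe primes 23 and 47 the group of squares has order p'q' = 11 × 23 and most squares generate it, whereas for the non-safe pair 13 and 37 (where p' = 6 and q' = 18 share a factor) no square generates the whole group. The helper names are hypothetical.

```python
from math import gcd


def squares(N):
    """The group Q_N of squares in Z_N^*."""
    return {pow(a, 2, N) for a in range(1, N) if gcd(a, N) == 1}


def order(g, N):
    """Multiplicative order of g modulo N (g must be invertible)."""
    k, y = 1, g % N
    while y != 1:
        y = (y * g) % N
        k += 1
    return k


def generators_of_squares(p, q):
    """Return (|Q_N|, number of elements of Q_N that generate all of Q_N)."""
    N = p * q
    Q = squares(N)
    return len(Q), sum(1 for g in Q if order(g, N) == len(Q))


# Safe primes: 23 = 2*11 + 1 and 47 = 2*23 + 1, so |Q_N| = 11 * 23 = 253.
size_safe, gens_safe = generators_of_squares(23, 47)

# Not safe: 13 = 2*6 + 1 and 37 = 2*18 + 1, gcd(6, 18) != 1, Q_N is not cyclic.
size_bad, gens_bad = generators_of_squares(13, 37)
```

With safe primes, a random square fails to be a generator only when its order drops by a factor p' or q', which happens with probability about 1/p' + 1/q'; this is why generators are easy to find when p' and q' are large.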

Shared Generation of RSA Keys. This raises the question of generating 
RSA moduli for Shoup’s threshold scheme without a trusted dealer. There exist 
protocols that generate RSA keys in a distributed manner [4,18,9,10,3,30,23]. 
Boneh and Franklin in [4] designed such a protocol for the generation of an RSA 
modulus in the honest-but-curious model. Later, Frankel, MacKenzie and Yung 
in [18] made this algorithm robust against malicious servers. In [30], Poupard 
and Stern also provided a protocol to compute a shared modulus for two players 
only. Finally, Gilboa in [23] extended Poupard and Stern's method. As we can 
note, the Boneh and Franklin protocol is the most efficient, but no protocol is 
known to efficiently create shared safe RSA moduli. 

1.2 Outline of the Paper 

We begin by presenting the problem in section 2, i.e. where the properties of 
safe primes are used in Shoup’s protocol, and the security model in section 3. 
Next, in section 4, we describe how to enhance the Boneh-Franklin scheme to 
generate RSA moduli having special requirements. In section 5 we show that 
Shoup’s protocol is still secure against a passive adversary and in section 6, we 
show a new proof of correctness making Shoup's scheme robust against an active 
adversary. Finally, in section 7 we present practical parameters for our scheme. 

1.3 Notations and Definitions 

Throughout this paper, we use the following notation: for any integer N = pq, 
where n = log(N) is a security parameter, as well as k, $\ell$, t, k', $k_1$ and $k_2$, 

- we use $Q_N$ to denote the group of squares in $\mathbb{Z}_N^*$, 

- we use $\varphi(N)$ to denote the Euler totient function, i.e. the cardinality 
  of $\mathbb{Z}_N^*$, 

- we use $\lambda(N)$ to denote Carmichael’s lambda function, defined as the 
  largest order of the elements of $\mathbb{Z}_N^*$. 

Let p = 2p' + 1 and q = 2q' + 1 where in general $p' = \prod_i p_i^{e_i}$ and 
$q' = \prod_j q_j^{f_j}$ are products of prime powers. Finally, a prime number p 
is a safe prime if p and p' are both prime. An RSA modulus N = pq is called a 
safe prime modulus if p and q are both safe primes. 

2 The Problem 

As we will see in the following, safe primes are used in the key generation in 
order to prove that Shamir secret sharing scheme [36] is secure in the ring Em, 
and not in a finite field, and in the proof of correctness. Let us explain the second 
problem as it is less obvious. 

2.1 What Is the Problem? 

Robustness guarantees that even if t malicious players send false signature shares, 
the signature scheme still correctly generates a signature s. This property is 
needed since otherwise combination faces the problem of selecting the correct 
shares. 

For example, the combiner receives signature shares from the servers and has 
to generate the correct signature. One way for him is to pick at random t + 1 
signature shares, generate the possible signatures s' and test whether s' is a valid 
signature of m. If s' passes the verification protocol, the correct signature has 
been found, otherwise, the combiner has to test another group of t + 1 signature 
shares. Since the combiner cannot guess where the bad shares are, it might face 
an exponential number of trials. Therefore, it is necessary to devise an efficient 
test in order to check whether a player has correctly answered a request. Shoup 
has proposed an efficient proof that performs this check non-interactively; the 
same kind of proof appears in [32,19], but it still requires a safe prime modulus. 

In order to avoid the generation of shared safe moduli, which currently appears 
out of reach, this paper proposes a tradeoff between the requirements on 
the RSA modulus for the signature and decryption protocols and the 
requirements at key generation. Independently of our work, Damgård and Koprowski 
have recently considered the same problem in [12]. They revisited Shoup’s paper 
and used non-standard assumptions to show that the proof of correctness works 
without other requirements on the RSA modulus. They use [18] to generate standard 
RSA moduli. 

In our work, we consider environments where high security is required, such 
as electronic voting schemes. Therefore, we prefer to use protocols based on 
standard assumptions. We believe that standard assumptions and security proofs 
are needed to build secure protocols. Several electronic voting schemes [14,11,1] 
have been based on the Paillier cryptosystem, which is related to RSA. The 
techniques developed in this paper can be used to fully share this cryptosystem. 

2.2 Our Results 

We prove that Shoup’s protocol can be modified to work with RSA moduli 
having special properties under standard assumptions and that these moduli 
can be jointly generated. 

Safe prime moduli are needed in the proof of robustness and in the key 
generation. Moreover, different characteristics of these numbers are used in the 
proof of robustness. Indeed, Shoup’s protocol uses two important properties of 
the subgroup $Q_N$ of squares of $\mathbb{Z}_N^*$ when N is a safe prime modulus. 
On one hand, this subgroup is cyclic and, on the other hand, its order M does not 
have small prime factors. Cyclicity is used to show the existence of the discrete 
log in the proof of correctness. The use of safe primes guarantees that, 
with overwhelming probability, a random element in $Q_N$ is a generator. 

Our first observation relates the structure of $Q_N$ to gcd(p − 1, q − 1), and 
the search for generators in this group to the prime factor decomposition of 
(p − 1)/2 and (q − 1)/2. In particular, if (p − 1)/2 and (q − 1)/2 have no small 
prime factors, then with high probability a few randomly chosen elements generate 
the entire group $Q_N$. If we choose enough random elements $(g_1, \dots, g_k)$, 
we can guarantee that the group generated by $(g_1, \dots, g_k)$ is all of $Q_N$ 
with high probability. Such techniques have already been used by Frankel et al. 
in [18] and a precise treatment has been given by Poupard and Stern in [31]. 
Moreover, using a nice trick of Gennaro et al. which first appeared in [22] and 
the protocol recently proposed by Catalano et al. in [8], the calculation of 
gcd(p − 1, q − 1) can be performed in a distributed way. These methods keep key 
generation and signature efficient. 

In this paper, we show how to jointly construct RSA moduli such that the 
subgroup $Q_N$ is cyclic, which guarantees the existence of discrete logs and of 
generators of $Q_N$. Moreover, the order M of this group has no prime factor 
less than some sieving bound B. Checking for such primes does not excessively 
increase the running time of the key generation algorithm. 



3 Security Model 

3.1 The Network 

We assume a group of ℓ servers connected to a broadcast medium, and that 
messages sent on the communication channel instantly reach every party connected 
to it. 

3.2 The Adversary 

The adversary is computationally bounded and can corrupt servers by viewing 
the memories of corrupted servers (passive adversary) and/or by modifying their 
behavior (active adversary). The adversary decides whom to corrupt at the start 
of the protocol (static adversary). We also assume that the adversary corrupts 
no more than t out of ℓ servers throughout the protocol, where ℓ ≥ 2t + 1. 

3.3 Formal Definition 

An RSA threshold signature scheme consists of the following four components: 

- A key generation algorithm takes as input a security parameter n, the number 
  k of elements used to generate $Q_N$, the number $\ell$ of signing servers, the 
  threshold parameter t and a random string $\omega$; it outputs a public key 
  (N, e), where n is the size in bits of N, the private keys $d_1, \dots, d_\ell$, 
  each known only by the corresponding server, and for each $u \in [1, k]$ a list 
  $v_{u,1} = v_u^{d_1}, \dots, v_{u,\ell} = v_u^{d_\ell} \bmod N$ of verification keys. 

- A share signature algorithm takes as input the public key (N, e), an index 
  $1 \le i \le \ell$, the private key $d_i$ and a message m; it outputs a signature 
  share $s_i = x^{d_i} \bmod N$, where x = H(m) and H(.) is a hash-and-pad function, 
  and a proof of its validity $\mathrm{proof}_i$ (for all $u \in [1, k]$, 
  $\log_{v_u} v_{u,i} = \log_x s_i$). 

- A combining algorithm takes as input the public key (N, e), a message m, a 
  list $s_1, \dots, s_\ell$ of signature shares, for each $u \in [1, k]$ the list 
  $v_{u,1}, \dots, v_{u,\ell}$ of verification keys and a list 
  $\mathrm{proof}_1, \dots, \mathrm{proof}_\ell$ of validity proofs; it outputs a 
  signature s or fails. 

- A verification algorithm takes as input the public key (N, e), a message m and 
  a signature s; it outputs a bit b indicating whether the signature is correct 
  or not. 



3.4 The Players and the Scenario 

Our game includes the following players: a combiner, a set of $\ell$ servers 
$P_i$, an adversary and users requesting signatures. All are considered as 
probabilistic polynomial-time Turing machines. We consider the following scenario: 

- At the initialization phase, the servers use the distributed key generation 
  algorithm to create the public, private and verification keys. The public key 
  (N, e) and all the verification keys, the $v_u$’s and $v_{u,i}$’s, are published 
  and each server obtains its share $d_i$ of the secret key d. 

- To sign a message m, the combiner first forwards m to the servers. Using 
  its secret key $d_i$ and its verification keys $v_u, v_{u,i}$ for $u \in [1, k]$, 
  each server runs the share signature algorithm and outputs a signature share 
  $s_i$ together with a proof of validity $\mathrm{proof}_i$ of the signature 
  share. Finally, the combiner uses the combining algorithm to generate the 
  signature, provided enough signature shares are available and valid. 



3.5 Properties of Threshold Signature Schemes 

The two properties of a t out of ℓ threshold signature scheme of interest to us 
are robustness and unforgeability. As we already mentioned, robustness guarantees 
that even if up to t malicious players send false signature shares, the scheme 
still returns a correct signature. This property is useful only in the presence 
of active adversaries. 

Unforgeability guarantees that any subset of t + 1 players can generate a 
signature s, but that t or fewer players cannot. This unforgeability 
property should hold even if a subset of at most t players is corrupted 
and colludes. This property expresses the security of the signature scheme and is 
useful in the presence of passive or active adversaries. 



3.6 The Games 

In this section, we describe the security notions for threshold key generation 
and threshold signing protocols. We have to show that the information revealed 
during the key generation and the signing protocols does not release secret 
information to the adversary. 

Game for the threshold key generation. The correctness of the key 
generation requires that the secret keys d, p, q and the public key (N, e) 
appear uniformly distributed to the adversary. 

The secrecy of the key generation means that an adversary A which corrupts at 
most t servers at the beginning of the game cannot obtain more information on 
the secret key held by uncorrupted players. 



Game for the threshold signing protocol. The secrecy of the signing 
protocol means that an adversary A which corrupts at most t servers at the 
beginning of the game cannot forge a signature on a new message, even if he can 
obtain signatures on adaptively chosen messages. 

4 Enhancing the Boneh-Franklin Scheme to Generate 
RSA Moduli with Special Requirements 

The aim of this section is to generate RSA moduli such that the group of squares 
is a cyclic group whose order has no small prime factors, namely N = pq, 
p = 2p' + 1 and q = 2q' + 1, gcd(p', q') = 1 and no prime less than B divides p' 
or q'. In section 6, we will prove that this group can be generated with few 
random elements. Moreover, we use here a sieving method that simultaneously 
improves the key generation protocol, increases the probability of finding a set 
of generators of $Q_N$, and makes the Shamir secret sharing scheme secure. We also 
present a protocol to compute the GCD of a known value and a shared value and 
prove the robustness and the secrecy of this new distributed key generation 
protocol. 



4.1 A New Distributed RSA Modulus Generation 

In [4], Boneh and Franklin present a protocol for generating a shared RSA 
modulus. We describe this protocol here with our adaptation. 

1. In the first step, each server picks at random two values $p_i$ and $q_i$ in 
   an interval chosen according to [39], where n is the size in bits 
   of the modulus N. Then, we use a sieving algorithm in order to discard 
   candidates for which $p_1 + \dots + p_\ell$ or $q_1 + \dots + q_\ell$ has small 
   prime factors, or for which $p_1 + \dots + p_\ell - 1$ or 
   $q_1 + \dots + q_\ell - 1$ has small prime factors. The servers check whether 
   gcd(p − 1, 4P) = 2 and whether gcd(4P, q − 1) = 2, where 
   $P = \prod_{2 < p_i < B} p_i$ and B is the sieving bound, using the GCD 
   algorithm that we describe below. 

2. Then the BGW protocol [2,4] is run to compute the product N of 
   $p_1 + \dots + p_\ell$ and $q_1 + \dots + q_\ell$. They also compute the product 
   $\varphi(N) = (p - 1)(q - 1)$ and check whether $\gcd(\varphi(N), N - 1) = 1$ 
   using the GCD algorithm. 

3. Next, the parties perform a primality test similar to the Fermat test modulo 
   N. The practicality of this test is based on the empirical results of [33], 
   where Rivest showed that if a sieving algorithm is first performed, the 
   Miller-Rabin primality test is not needed, as pseudoprimes are rare according 
   to Pomerance’s conjectures [28,29]. Moreover, Carmichael numbers are avoided 
   thanks to a trick similar to the Solovay-Strassen primality test. We set 
   $p = p_1 + \dots + p_\ell$ and $q = q_1 + \dots + q_\ell$. 

4.2 Computing the gcd of a Public Value and a Shared Secret Value 

We briefly recall the protocol presented by Catalano et al. [8] for inverting a 
public value e modulo a shared value $\varphi$. The basic trick stems from the 
observation that $\gcd(e, \varphi) = \gcd(e, \varphi + Re)$, where R is a large 
integer used to mask the shared and secret value 
$\varphi = \varphi_1 + \dots + \varphi_\ell$. Server i chooses a random integer 
$r_i$ in a large interval whose size depends on a security parameter k', computes 
$c_i = \varphi_i + e r_i$ and forwards $c_i$ to all other servers. Each server can 
compute $c = \sum_i c_i = \varphi + eR$ if we set $R = \sum_i r_i$. This value can 
be publicly known and then all servers can compute gcd(e, c), which is equal to 
$\gcd(e, \varphi)$, and u and v such that eu + cv = 1 when $\gcd(e, \varphi) = 1$. 
Then, it is easy to show that if we replace c by $\varphi + eR$, we obtain 
$e(u + Rv) + \varphi v = 1$. Hence, u + Rv is the inverse of e modulo $\varphi$. 
In this case, if we denote by d the inverse of e mod $\varphi$, each server sets 
its share of the inverse d to $d_i = v r_i$, and the first server to 
$d_1 = u + v r_1$. 

We have presented the protocol here in the honest-but-curious model, but 
it can be made robust following [8]. We can also note that this algorithm makes 
it possible to compute the gcd of a known value and a shared one. We call 
this protocol the GCD algorithm. 
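
The arithmetic of the GCD algorithm can be simulated in a few lines. The sketch below is a single-process illustration of the honest-but-curious version only: there is no real communication, the mask size `mask_bits` is an assumed parameter standing in for the range governed by k', and the function names are hypothetical.

```python
import secrets


def egcd(a, b):
    """Return (g, u, v) with a*u + b*v == g == gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, u, v = egcd(b, a % b)
    return g, v, u - (a // b) * v


def shared_inverse(e, phi_shares, mask_bits=64):
    """Simulate the masking trick on additive shares of a secret phi.

    Returns (gcd(e, phi), additive shares of e^{-1} mod phi or None).
    """
    # Server i publishes c_i = phi_i + e * r_i for a random mask r_i.
    r = [secrets.randbits(mask_bits) for _ in phi_shares]
    c = sum(phi_i + e * r_i for phi_i, r_i in zip(phi_shares, r))  # c = phi + e*R
    g, u, v = egcd(e, c)  # public computation on public values
    if g != 1:
        return g, None
    # e*u + c*v = 1 gives e*(u + R*v) + phi*v = 1, so d = u + R*v.
    d_shares = [v * r_i for r_i in r]
    d_shares[0] += u      # the first server adds the public term u
    return g, d_shares


phi = 1018 * 1186                            # toy secret, shared additively below
phi_shares = [phi - 3000, 1000, 2000]
g, d_shares = shared_inverse(7, phi_shares)
g_bad, no_shares = shared_inverse(509, phi_shares)  # 509 divides phi
```

The shares of d are integers that sum to an inverse of e modulo the secret value; no server ever sees that value itself, only the masked sum c.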

4.3 Efficient Sieving Algorithm Improving the Generation of 
Random Numbers without Small Factors 

In this subsection, we present the sieving algorithm used in phase 1 of 
protocol 4.1 and we show how to generate N such that neither p' = (p − 1)/2 nor 
q' = (q − 1)/2 has small prime factors. Our method uses a new distributed sieving 
protocol designed by Boneh, Malkin and Wu in [5], which we patch in order to 
create p such that neither p nor p' has a prime factor less than B. Moreover, we 
show how to withstand malicious adversaries. We denote by P the product of all 
odd small primes up to B. 

1. Each server picks a random integer $a_i$ in the range [1, P] such that $a_i$ 
   is relatively prime to P. Then, since each $a_i$ is a random integer relatively 
   prime to P, their product $a = a_1 \times \dots \times a_\ell \bmod P$ is also 
   relatively prime to P. 

2. The servers perform a protocol to convert the multiplicative sharing of a to 
   an additive sharing $a = b_1 + \dots + b_\ell$ using the BGW protocol. 

3. Each server picks a random $r_i$ in a suitable range and sets 
   $p_i = r_i P + b_i$. 

Clearly, $p = \sum_i p_i \equiv a \bmod P$ and hence p is not divisible by any 
prime smaller than B. We can note that p = RP + a where $R = \sum_i r_i$. This 
sieve works only for B such that P < p. We can increase the bound B to $B_1$ by 
checking whether gcd(P', p) = 1, where $P' = \prod_{B < p_i < B_1} p_i$, thanks 
to the GCD algorithm. 

In order to also prove that p' = (p − 1)/2 has no prime factors less than B, one 
has to check whether gcd(p − 1, P) = gcd(2p', P) = 1, and gcd(p − 1, 4P) = 2 to 
test the power of 2. If we write P' for 4P, we can perform a single test 
gcd(p − 1, P') = 2. 

To distribute this test in the honest-but-curious model, the first server sets 
its part $p_1$ to $p_1 - 1$ and we use the distributed GCD protocol described in 
section 4.2. It is also possible to make this test robust in the presence of 
malicious players, as explained in appendix 9.1. 
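
The sieve and the test on p − 1 can be simulated as follows. This is a sketch under explicit assumptions: everything runs in one process, the BGW conversion from a multiplicative to an additive sharing is replaced by a direct re-sharing of a, the size `r_bits` of the random multipliers is arbitrary, and the function names are hypothetical.

```python
import secrets
from math import gcd, prod


def odd_primes_up_to(B):
    """Odd primes not exceeding B, by trial division (fine for small B)."""
    return [n for n in range(3, B + 1)
            if all(n % m for m in range(2, int(n**0.5) + 1))]


def sieve_candidate(n_servers, B, r_bits=32):
    """Return (p, P) where p is a sum of shares with no odd prime factor <= B."""
    P = prod(odd_primes_up_to(B))
    # Step 1: each server picks a_i coprime to P; a is their product mod P.
    a_parts = []
    while len(a_parts) < n_servers:
        a_i = 1 + secrets.randbelow(P)
        if gcd(a_i, P) == 1:
            a_parts.append(a_i)
    a = prod(a_parts) % P
    # Step 2 (stand-in for BGW): additive shares b_i with sum congruent to a mod P.
    b = [secrets.randbelow(P) for _ in range(n_servers - 1)]
    b.append((a - sum(b)) % P)
    # Step 3: p_i = r_i * P + b_i, so p = sum(p_i) is congruent to a mod P.
    p_parts = [secrets.randbits(r_bits) * P + b_i for b_i in b]
    return sum(p_parts), P


def p_minus_one_test(p, P):
    """Single test gcd(p - 1, 4P) = 2: p' = (p - 1)/2 is odd and coprime to P."""
    return gcd(p - 1, 4 * P) == 2


p, P = sieve_candidate(3, 30)
```

Since P contains only odd primes, the parity of the candidate still has to be handled separately; the sketch leaves that aside and only checks the odd small primes.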

Finally, the protocol that transforms the multiplicative sharing of a into an 
additive sharing can also be made robust as it uses the BGW protocol. This 
transformation calls £ times the BGW protocol. At the beginning, 6^ 0 = 0 for all 
i G {0, . . . ,£}. Then, for i = 1 to £, Ui = and Uj = 0 for Vj yf i, and the BGW 
protocol performs (6i_*_i + . . . + xoi = + . . . + be^i-i){ui + . . .+ue) = 

bi^i + . . . + bi^i- 

Theorem 1. The key generation protocol of Boneh- Franklin and the sieving 
protocol allow to generate RSA moduli such that the order M of the group Qn 
does not contain small prime factors less than B. 

Clearly, using the sieving method to choose the $p_i$’s and $q_i$’s improves the 
first step of Boneh-Franklin’s protocol and speeds up the running time of this 
algorithm, since it avoids many rewindings in phase 3 of Boneh-Franklin. Moreover, 
this sieving protocol is adapted to take into account the small factors in the 
factorization of p' and q'. 



4.4 The Key Generation of N Such That Qn Is Cyclic 

Here, we show how to generate N such that the group $Q_N$ is cyclic. To guarantee 
this property, we use the fact that the product of two cyclic groups whose orders 
are coprime is a cyclic group. The following lemma and the GCD protocol make it 
possible to check in a distributed way that p' and q' are coprime. First we prove 
a lemma which has been used in another form in [22]. 

Lemma 1. Let N = pq be an RSA modulus. Then 
$\gcd(p - 1, q - 1) \mid \gcd(N - 1, \varphi(N))$, and the square-free part of 
$\gcd(N - 1, \varphi(N))$ divides $\gcd(p - 1, q - 1)$. 

See appendix section 9.2 for a proof of this lemma. 

Corollary 1. If $\gcd(N - 1, \varphi(N)) = 2$, then $\gcd(p - 1, q - 1) = 2$. 

Proof. If $\gcd(N - 1, \varphi(N)) = 2$, then, as 
$\gcd(p - 1, q - 1) \mid \gcd(N - 1, \varphi(N))$, we have 
$\gcd(p - 1, q - 1) = 2$, since $\gcd(p - 1, q - 1)$ cannot be equal to 1 
(p − 1 and q − 1 are both even). This last verification can be done jointly using 
the GCD algorithm described in section 4.2. 

□ 
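
Lemma 1 and Corollary 1 are easy to check numerically on small primes. The sketch below is an illustration only; it interprets "square-free part" as the product of the distinct primes dividing the number, which is an assumption about the intended meaning, and the helper names are hypothetical.

```python
from math import gcd


def squarefree_part(n):
    """Product of the distinct primes dividing n (assumed meaning)."""
    result, f = 1, 2
    while f * f <= n:
        if n % f == 0:
            result *= f
            while n % f == 0:
                n //= f
        f += 1
    return result * n if n > 1 else result


def lemma_holds(p, q):
    """Check both divisibility claims of Lemma 1 for the pair (p, q)."""
    N, phi = p * q, (p - 1) * (q - 1)
    g = gcd(p - 1, q - 1)
    G = gcd(N - 1, phi)
    return G % g == 0 and g % squarefree_part(G) == 0


def corollary_holds(p, q):
    """If gcd(N - 1, phi(N)) = 2 then gcd(p - 1, q - 1) = 2."""
    N, phi = p * q, (p - 1) * (q - 1)
    return gcd(N - 1, phi) != 2 or gcd(p - 1, q - 1) == 2


odd_primes = [n for n in range(3, 200) if all(n % m for m in range(2, n))]
pairs = [(p, q) for p in odd_primes for q in odd_primes if p < q]
```

The first claim follows from the identity N − 1 = (p − 1)(q − 1) + (p − 1) + (q − 1), which the loop over `pairs` exercises on every pair of odd primes below 200.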

Theorem 2. The key generation protocol of Boneh-Franklin and the GCD protocol
make it possible to generate RSA moduli such that the group $Q_N$ is cyclic of
order $M = p'q'$, where $N = pq$, $p = 2p' + 1$, $q = 2q' + 1$ and neither
$p'$ nor $q'$ has prime factors smaller than $B$. The iteration number of this
protocol with respect to the Boneh-Franklin protocol is on average
$4 \times e^{\gamma}\ln(B)$.

Proof. Following section 4.3 and Corollary 1, we can assume that we get an RSA
modulus such that $p - 1$ and $q - 1$ have all their prime divisors other than
2 greater than $B$, they have no common divisor other than 2, i.e.,
$\gcd(p-1, q-1) = 2$, and $p'$ and $q'$ do not have small prime factors. As
the product of cyclic groups whose orders are coprime is a cyclic group, and
the groups of squares in $Z_p^*$ and in $Z_q^*$ are cyclic, the group $Q_N$ is
also cyclic. This guarantees that there exists a cyclic subgroup of $Z_N^*$ of
order $M = p'q'$.

We can estimate the iteration number of this algorithm with respect to
phase 1 of the Boneh-Franklin protocol. First, it is a well-known fact that
$\lim_{n\to\infty} \Pr_{(p',q') \in [1,n[^2}[\gcd(p',q') = 1] = 6/\pi^2 > 1/2$,
assuming prime numbers in $[1,n[$ are uniformly distributed. Moreover, the
only slowing factor in the key generation is the check that
$\gcd(P', p-1) = 2$, where $P' = 4P$. We can note that

\[ \Pr_{p'}[\gcd(2p', P') = 2] = \Pr_{p'}[2 \nmid p' \wedge 3 \nmid p' \wedge \dots \wedge B \nmid p'] = (1 - \tfrac{1}{2})(1 - \tfrac{1}{3}) \cdots (1 - \tfrac{1}{B}) = \prod_{p_i \le B} (1 - \tfrac{1}{p_i}) \approx \frac{e^{-\gamma}}{\ln(B)} \]

according to the second theorem of Mertens, where $\gamma$ is the Euler
constant. Therefore, we have to run this algorithm
$2 \times (e^{\gamma}\ln(B) + e^{\gamma}\ln(B))$ times on average in order to
get such RSA moduli. □
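The Mertens estimate and the resulting iteration count are easy to check
numerically. The following sketch is illustrative only: it takes $B = 2^{16}$
(the value used in section 7), and the function names are ours.

```python
from math import exp, log

EULER_GAMMA = 0.5772156649015329  # Euler's constant

def primes_up_to(n):
    """Sieve of Eratosthenes returning all primes <= n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i, flag in enumerate(sieve) if flag]

def survival_probability(B):
    """Product of (1 - 1/p) over all primes p <= B."""
    prob = 1.0
    for p in primes_up_to(B):
        prob *= 1.0 - 1.0 / p
    return prob

B = 2 ** 16
exact = survival_probability(B)              # exact product over primes
mertens = exp(-EULER_GAMMA) / log(B)         # Mertens approximation
expected_runs = 4 * exp(EULER_GAMMA) * log(B)  # iteration factor of Theorem 2
```

With these values the approximation and the exact product agree closely, and
`expected_runs` comes out at roughly 79, in line with the factor 80 quoted in
section 7.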



4.5 Proofs of Security and Robustness 

For security and robustness proofs see [4,18,8] or the extended version. 



4.6 Distributed Generation of the Keys in Shoup’s Protocol 

Once $N$ is generated, let $e$ be the first prime greater than $4\Delta^2$, so
that each server can compute it. Then, the protocol of Catalano et al. [8] is
run to generate a shared secret key $d$ in a distributed manner. At the end of
the protocol, each
server can compute its verification keys as for random computed 

as yf mod N where is the concatenation of P{{N\\i) for sufficiently many f’s 
to get the correct security parameter in the random oracle model. 



5 Security of Shoup Protocol against Passive Adversary 

In this part we show that Shoup's protocol is secure, with the RSA moduli
generated in the previous section, against a static and passive adversary.



5.1 Key Generation 

At the end of the key generation protocol, we want to know whether the
information an adversary can collect helps him to get useful information on
the secret key $d$. Let $d_{i_1}, \dots, d_{i_t}$ be the $t$ shares of the
secret key obtained by the adversary from the $t$ corrupted servers.

Information revealed by the shares $d_{i_j}$. For each $d_{i_j}$, we can note
that

\[ d_{i_j} = f(i_j) = a_0 + a_1 i_j + \dots + a_t i_j^t \bmod M \]

If we add a $(t+1)$th equation, $a_0 = d$, we obtain the following linear
system:

\[ \begin{cases}
a_0 + a_1 i_1 + \dots + a_t i_1^t = d_{i_1} \bmod M \\
a_0 + a_1 i_2 + \dots + a_t i_2^t = d_{i_2} \bmod M \\
\qquad \vdots \\
a_0 + a_1 i_t + \dots + a_t i_t^t = d_{i_t} \bmod M \\
a_0 + 0 + \dots + 0 = d
\end{cases} \]

or with matrix $I.A = D \bmod M$

\[ \begin{pmatrix}
1 & i_1 & i_1^2 & \dots & i_1^t \\
1 & i_2 & i_2^2 & \dots & i_2^t \\
\vdots & & & & \vdots \\
1 & i_t & i_t^2 & \dots & i_t^t \\
1 & 0 & 0 & \dots & 0
\end{pmatrix}
\begin{pmatrix} a_0 \\ a_1 \\ \vdots \\ a_t \end{pmatrix}
=
\begin{pmatrix} d_{i_1} \\ d_{i_2} \\ \vdots \\ d_{i_t} \\ d \end{pmatrix}
\bmod M \]



The matrix $I$ is a Vandermonde matrix. The determinant of such a matrix is
$\det(I) = \prod_{1 \le j < k \le t+1} (i_k - i_j)$, where $i_{t+1} = 0$. As
the $i_j$'s are distinct in $Z_M$, $i_k - i_j \neq 0 \bmod M$; moreover, since
$\ell < B$, each difference satisfies $|i_k - i_j| \le \ell < B$ and is
therefore coprime to $M$, which has no prime factor less than $B$. Hence, each
factor $(i_k - i_j)$ is invertible modulo $M$, so $\det(I)$ is invertible
modulo $M$. Therefore, all values of $d$ are possible. Hence a group of $t$
players cannot get information on $d$ from shares of $d$. We see here that the
sieving algorithm is important to avoid leakage of information on $d$, and
that it requires $\ell < B$.
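The argument can be illustrated on toy numbers. The sketch below is a
hypothetical example (tiny modulus, names of our choosing, Python 3.8+ for
`pow(x, -1, M)`): it computes the Vandermonde determinant modulo $M$, checks
that it is a unit when $M$ has no small prime factor, and shows that every
candidate secret is explained by some polynomial consistent with the
corrupted shares.

```python
from itertools import combinations
from math import gcd

def vandermonde_det(points, M):
    """Determinant of the Vandermonde matrix on `points`, reduced mod M."""
    det = 1
    for j, k in combinations(range(len(points)), 2):
        det = det * (points[k] - points[j]) % M
    return det

def interpolate_at(points, values, x, M):
    """Evaluate at x the unique polynomial through (points, values) mod M.

    Raises ValueError if some difference of points is not invertible mod M.
    """
    total = 0
    for j, (xj, yj) in enumerate(zip(points, values)):
        num, den = 1, 1
        for m, xm in enumerate(points):
            if m != j:
                num = num * (x - xm) % M
                den = den * (xj - xm) % M
        total = (total + yj * num * pow(den, -1, M)) % M
    return total

M = 101 * 103            # toy modulus: no prime factor below B = 100
corrupted = [1, 3]       # indices i_1, i_2 of the t = 2 corrupted servers
f = lambda x: (7 + 5 * x + 11 * x * x) % M   # real polynomial, d = f(0) = 7
shares = [f(i) for i in corrupted]

points = corrupted + [0]                      # i_{t+1} = 0
det = vandermonde_det(points, M)

# Each candidate secret d yields a valid polynomial; here we evaluate it at 2.
candidates = [0, 7, 1234, M - 1]
explained = [interpolate_at(points, shares + [d], 2, M) for d in candidates]
```

If $M$ had a small prime factor, for instance $M = 3 \times 103$ with indices
1 and 4, the difference $4 - 1$ would no longer be invertible and the
determinant would share a factor with $M$.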



Information revealed from the verification keys. For all u G the 

verification keys u^y’s of non-corrupted servers do not reveal any information 
as they can easily be simulated from the parts of the corrupted servers. The 
simulator chooses at random G Z(v*, computes Vu = mod N. Hence, 
= Vu^- We can note that Adi = ^fk^j niod ip{N) if we denote by 

S a group of t + 1 values and if we define the Lagrange coefficients as Xfj = 
^ ^ rij'6S\j G The simulator can then compute for all u G [1, fc], = 

2AX? f 

Vu X riy=i ^ niod N, where S = {0, zi, . . . , Zt}- Hence a group of t 

players cannot get information from the validation keys of non-corrupted servers. 



5.2 Signing Protocol 



To generate a signature on message $m$, each server $i$ computes $x = H(m)$
and $s_i = x^{2\Delta d_i} \bmod N$, and sends $s_i$ to the combiner without
any proof because we are in the honest-but-curious model. The combiner selects
a set $S$ of $t + 1$ values and computes
$w = \prod_{j \in S} s_j^{2\lambda^S_{0,j}} \bmod N$. It then follows that
$w^e = x^{4\Delta^2}$, as
$\Delta d = \sum_{j \in S} \lambda^S_{0,j} d_j \bmod M$. Applying a well-known
lemma, we can extract an $e$-th root of $x$ from $w$, which is an $e$-th root
of $x^{4\Delta^2}$, as $4\Delta^2$ is a known value, using the extended
Euclidean algorithm on $e$ and $4\Delta^2$. As we are in the honest-but-curious
model, the signature $s$ is always correct.
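The root-extraction step can be sketched as follows. This is a toy
illustration under our own assumptions: the parameters are far too small for
real use, $x = 5$ stands in for $H(m)$, and $w$ is computed directly from $d$
rather than combined from shares. It requires Python 3.8+ for modular
exponentiation with negative exponents.

```python
def extended_gcd(a, b):
    """Return (g, u, v) with u*a + v*b == g == gcd(a, b)."""
    old_r, r = a, b
    old_u, u = 1, 0
    old_v, v = 0, 1
    while r:
        quo = old_r // r
        old_r, r = r, old_r - quo * r
        old_u, u = u, old_u - quo * u
        old_v, v = v, old_v - quo * v
    return old_r, old_u, old_v

def extract_e_root(w, x, e, c, N):
    """Given w^e == x^c (mod N) with gcd(e, c) == 1, return y with y^e == x."""
    g, u, v = extended_gcd(c, e)          # u*c + v*e == 1
    assert g == 1
    # y^e = w^(u*e) * x^(v*e) = x^(u*c + v*e) = x
    return pow(w, u, N) * pow(x, v, N) % N

# Toy parameters (hypothetical).
p, q = 23, 47                 # p' = 11, q' = 23
N, M = p * q, 11 * 23
delta = 6                     # 3! for 3 servers
c = 4 * delta ** 2            # 4 * Delta^2 = 144
e = 149                       # first prime greater than 144
d = pow(e, -1, M)             # secret exponent, e*d == 1 (mod M)
x = 5                         # stands in for H(m)
w = pow(x, c * d, N)          # value the combiner would obtain
y = extract_e_root(w, x, e, c, N)
```

Here `y` plays the role of the final signature: raising it to the power $e$
gives back $x$.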

We can prove the following theorem : 

Theorem 3. In the random oracle model, the signing protocol described above is 
a secure threshold signature scheme (non-forgeable) assuming the standard RSA 
signature scheme is secure. 

Proof. Similar to [37]. □ 

6 Enhancing the Shoup Scheme against Active Adversary

The aim of this section is to revisit the proof of correctness originally
designed by Shoup to cover the case of RSA moduli generated as in section 4.
This uses a method by which, with high probability, we generate the entire
group of squares with few random elements.

6.1 Proof of Correctness 

Let $N$ be a modulus such that $N = pq$, $p = 2p' + 1$ and $q = 2q' + 1$,
where $p'$ and $q'$ have no small prime factors and $\gcd(p-1, q-1) = 2$.
Accordingly, $Q_N$ is cyclic and there exists a generator $g$ in $Q_N$. Thus,
the discrete log of any element $s_i^2$ in basis $g$ exists, where
$s_i = x^{2\Delta d_i}$ and $\Delta = \ell!$. As we will see in section 6.3,
we can denote by $v_1, \dots, v_k$ a $k$-tuple of random elements in
$(Q_N)^k$ such that, with high probability, this tuple generates the whole
group $Q_N$ of order $M = p'q'$, i.e. for each $x \in Q_N$, there exists
$(a_1, \dots, a_k) \in [0, M[^k$ such that
$x = \prod_{i=1}^{k} v_i^{a_i} \bmod N$.

Each server $i$ has a $k$-tuple of verification keys
$(v_{1,i} = v_1^{d_i} \bmod N, \dots, v_{k,i} = v_k^{d_i} \bmod N)$. He
computes a signature share $s_i = x^{2\Delta d_i} \bmod N$, where $d_i$ is the
$i$th share of $d$, and proves that

\[ \log_{v_1}(v_{1,i}) = \dots = \log_{v_k}(v_{k,i}) = \log_{x^{4\Delta}}(s_i^2) \ (= d_i \bmod M) \]

The value $x^{4\Delta}$ is a square and is an element of $Q_N$.

Now, we describe the proof of “correctness” and still let $d_i \in [0, M[$ be
the secret share of a server, and $A$ and $B'$ two integers such that
$\log(A) > \log(B'Mh) + k_2$, where $B'$ and $k_2$ are security parameters and
$h$ is the number of rounds. Finally, $k_1$ is a parameter such that the
cheating probability $1/B'^h$ is $< 1/2^{k_1}$. Whereas security parameter
$k_1$ controls the completeness and statistical zero-knowledge results,
security parameter $k_2$ controls the soundness result. We present the
protocol for one round ($h = 1$).

The prover chooses a random $r$ in $[0, A[$. Then, he computes
$t = (v'_1, \dots, v'_k, x') = (v_1^r, \dots, v_k^r, x^{4\Delta r})$. Let $e$
be the first $b' = \log(B') - 1$ bits of the hash value

\[ e = [H(v_1, \dots, v_k, x^{4\Delta}, v_{1,i}, \dots, v_{k,i}, s_i^2, v'_1, \dots, v'_k, x')]_{b'} \]

if we denote by $[x]_{b'}$ the first $b'$ bits of $x$. Next, the prover
calculates $z = r + e d_i$. The proof is the pair
$(e, z) \in [0, B'[ \times [0, A[$. To check it, the verifier has to check
whether

\[ e = [H(v_1, \dots, v_k, x^{4\Delta}, v_{1,i}, \dots, v_{k,i}, s_i^2, v_1^z v_{1,i}^{-e}, \dots, v_k^z v_{k,i}^{-e}, x^{4\Delta z} s_i^{-2e})]_{b'} \]

and verify whether $0 \le z < A$.
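A one-round version of this proof can be sketched in a few lines. The code
below is an illustration under explicit assumptions, not the authors'
implementation: SHA-256 over a Python `repr` stands in for the random oracle
$H$, the challenge is the top 16 bits of the digest, and the toy modulus is
built from two small primes $p'$, $q'$ with $2p'+1$, $2q'+1$ prime, picked
only because they trivially have no prime factor below $B' = 2^{16}$. All
function names are ours; Python 3.8+ is needed for negative modular exponents.

```python
import hashlib
import random
from math import factorial, gcd

def is_prime(n):
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def next_germain_prime(start):
    """Smallest prime c >= start such that 2c + 1 is also prime."""
    c = start
    while not (is_prime(c) and is_prime(2 * c + 1)):
        c += 1
    return c

def random_unit(rng, N):
    """Random element of Z_N^* (rejection sampling)."""
    while True:
        a = rng.randrange(2, N)
        if gcd(a, N) == 1:
            return a

def challenge(bases, images, commit, bits):
    """Top `bits` bits of SHA-256 over the transcript (stand-in for H)."""
    digest = hashlib.sha256(repr((bases, images, commit)).encode()).digest()
    return int.from_bytes(digest, "big") >> (256 - bits)

def prove(bases, images, d_i, A, N, bits, rng):
    """Prover: commit with random r, derive e by hashing, answer z = r + e*d_i."""
    r = rng.randrange(A)
    commit = [pow(b, r, N) for b in bases]
    e = challenge(bases, images, commit, bits)
    return e, r + e * d_i

def verify(bases, images, proof, A, N, bits):
    """Verifier: recompute the commitments from (e, z) and compare hashes."""
    e, z = proof
    if not 0 <= z < A:
        return False
    commit = [pow(b, z, N) * pow(y, -e, N) % N for b, y in zip(bases, images)]
    return e == challenge(bases, images, commit, bits)

rng = random.Random(2001)
pp = next_germain_prime(2 ** 16 + 1)
qq = next_germain_prime(pp + 1)
N, M = (2 * pp + 1) * (2 * qq + 1), pp * qq
delta = factorial(5)                      # Delta = l! for l = 5 servers
bits = 16                                 # challenge length, B' = 2^16
A = (1 << bits) * M << 40                 # log A = log(B' M) + 40
d_i = rng.randrange(M)                    # this server's share of d
x = random_unit(rng, N)                   # stands in for H(m)
vs = [pow(random_unit(rng, N), 2, N) for _ in range(3)]   # v_1..v_k, k = 3
s_i = pow(x, 2 * delta * d_i, N)          # signature share

bases = vs + [pow(x, 4 * delta, N)]
images = [pow(v, d_i, N) for v in vs] + [pow(s_i, 2, N)]
proof = prove(bases, images, d_i, A, N, bits, rng)
```

An honest proof is accepted whenever $z < A$, which fails only with
probability about $2^{-40}$ for this choice of $A$. A share computed with a
different exponent changes the recomputed commitments, so it is rejected
except with probability about $1/B'$ per round.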

6.2 Security Analysis of the Proof of Correctness 
Proof of Completeness 

Theorem 4. The execution of the protocol between a prover who knows the 
secret di and a verifier is successful with overwhelming probability if B'Mh/A is 
negligible where h is the number of rounds. 

Proof. If the prover knows a secret $d_i \in [0, M[$ and follows the protocol,
he fails only if some $z \ge A$. For any value of the secret in $[0, M[$, the
probability of such a failure, taken over all possible choices of $r$, is
smaller than $B'M/A$. Consequently the execution of the proof is successful
with probability $\ge (1 - \frac{B'M}{A})^h \ge 1 - \frac{B'Mh}{A}$.

□

Proof of Soundness. Let us focus on soundness of the interactive proof system. 

Lemma 2. If the verifier accepts the proof with probability
$> 1/B' + \varepsilon$, where $\varepsilon$ is a non-negligible quantity, then
using the prover as a “black-box” it is possible to compute $\sigma$ and
$\tau$ with $|\sigma| < A$ and $|\tau| < B'$ such that
$v_1^{\sigma} = v_{1,i}^{\tau}, \dots, v_k^{\sigma} = v_{k,i}^{\tau}$ and
$x^{4\Delta\sigma} = s_i^{2\tau}$.

Proof. If we rewind the adversary and get two valid proofs for the same
commitment $t$, $(e, z)$ and $(e', z')$, we have for $u = 1, \dots, k$, i.e.
for all verification keys,
$v'_u = v_u^{z} v_{u,i}^{-e} = v_u^{z'} v_{u,i}^{-e'}$. So, we obtain
$v_u^{\sigma} = v_{u,i}^{\tau} \bmod N$ if we set $\sigma = z - z'$ and
$\tau = e - e'$. Therefore we can write
$v_1^{\sigma} = v_{1,i}^{\tau}, \dots, v_k^{\sigma} = v_{k,i}^{\tau}$, and
likewise $x^{4\Delta\sigma} = s_i^{2\tau}$.

□

Theorem 5. (Soundness) Assume that some probabilistic polynomial-time Turing
machine $P$ is accepted with non-negligible probability. If $B' < B$,
$h \times \log(B') = O(k_1)$, $k = O(k_1/\log(B))$ and $\log(A)$ is a
polynomial in $k_1$ and $\log(N)$, we can prove that
$x^{4\Delta d_i} = s_i^2$ and so $s_i$ is a correct signature share.

Proof. By the previous lemma we can assume that we have $\tau$ and $\sigma$
such that $v_u^{\sigma} = v_{u,i}^{\tau}$ for $u = 1, \dots, k$ and
$x^{4\Delta\sigma} = s_i^{2\tau}$.

Then, we can write $x^{4\Delta}$ with the set of generators of $Q_N$ since it
is a square:

\[ x^{4\Delta} = v_1^{a_1} \times \dots \times v_k^{a_k}. \]

Consequently, if we raise this equation to the power $\sigma$, we obtain
$x^{4\Delta\sigma} = v_1^{a_1\sigma} \times \dots \times v_k^{a_k\sigma}$.
But $x^{4\Delta\sigma}$ is equal to $s_i^{2\tau}$, and
$v_1^{a_1\sigma} \times \dots \times v_k^{a_k\sigma}$ is equal to
$v_{1,i}^{a_1\tau} \times \dots \times v_{k,i}^{a_k\tau}$ as
$v_u^{\sigma} = v_{u,i}^{\tau}$ for $u = 1, \dots, k$.

Therefore, $s_i^{2\tau} = x^{4\Delta d_i \tau}$ with $|\tau| < B'$. We can
simplify this equation by $\tau$ if $\tau$ is coprime with $p'q'$. So we
obtain $s_i^2 = x^{4\Delta d_i}$ if $B' < B$.

Let $\pi(k_1)$ be the probability of success of $P$. If $\pi(k_1)$ is
non-negligible, there exists an integer $c$ such that
$\pi(k_1) \ge 1/k_1^c$ for infinitely many values $k_1$. The probability for
$P$ to generate a correct signature share while the $v_i$'s generate the group
$Q_N$ is larger than
$\pi(k_1) - 2 \times \frac{k+B-1}{k-1} \times \frac{1}{B^k}$ according to the
result of section 6.3. So, if $k = O(k_1/\log(B))$, for infinitely many values
$k_1$, $2 \times \frac{k+B-1}{k-1} \times \frac{1}{B^k} < 1/(3k_1^c)$.

Furthermore, for $k_1$ large enough, $1/B'^h < 1/(3k_1^c)$ if
$h \times \log(B') = O(k_1)$. So by taking $\varepsilon = \pi(k_1)/3$ in
lemma 2 we conclude that it is possible to obtain $(\sigma, \tau)$ in
polynomial time $O(1/\varepsilon) = O(k_1^c)$. □

Proof of Statistical Zero- Knowledge 

Proof. Furthermore, we can prove that if $A$ is much larger than
$B' \times N$, the protocol statistically gives no information about the
secret. In the random oracle model, where the simulator has full control of
the values returned by the hash function $H$, the simulator picks
$e \in [0, B'[$ and $z \in [0, A[$ at random and defines the first $b'$ bits
of the value of $H$ at

\[ (v_1, \dots, v_k, x^{4\Delta}, v_{1,i}, \dots, v_{k,i}, s_i^2, v_1^z v_{1,i}^{-e}, \dots, v_k^z v_{k,i}^{-e}, x^{4\Delta z} s_i^{-2e}) \]

to be $e$. With overwhelming probability, the random oracle has not yet been
defined at this point, so the adversary cannot detect the fraud. □

6.3 Choice of Parameters 

In this section we prove that with high probability we generate the entire
group of squares $Q_N$ with only a few random elements.

Theorem 6. With probability greater than
$1 - 2 \times \frac{k+B-1}{k-1} \times \frac{1}{B^k}$, a random $k$-tuple
$(v_1, \dots, v_k)$ generates $Q_N$.

See appendix section 9.3 for a proof of this theorem. 
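The bound of Theorem 6, as derived in appendix 9.3, can be evaluated exactly
to see how many random elements are needed for a given security level. The
sketch below uses exact rational arithmetic; the function names are ours and
$B = 2^{16}$ is the value used in section 7.

```python
from fractions import Fraction

def failure_bound(k, B):
    """Upper bound 2 * (k+B-1)/(k-1) * 1/B^k on the failure probability."""
    return 2 * Fraction(k + B - 1, k - 1) / B ** k

def min_generators(B, security_bits):
    """Smallest k >= 2 whose failure bound is below 2^-security_bits."""
    k = 2
    while failure_bound(k, B) >= Fraction(1, 2 ** security_bits):
        k += 1
    return k

B = 2 ** 16
k = min_generators(B, 80)
```

For $B = 2^{16}$ and an 80-bit target, five elements are not quite enough and
six suffice, which matches the choice of six verification keys in section 7.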

7 Practical Parameters for the Scheme 

In the key generation we can test whether $p$, $q$, $p'$ and $q'$ are
divisible by small primes $< B$ and whether $\gcd(p', q') = 1$. We can assume
that $B$ is the first prime greater than $2^{16}$. The loss in the key
generation phase is a factor 80 on average. Indeed, we have seen in the proof
of theorem 2 (section 4.4) that
$\Pr_{p'}[p' \text{ has no small prime factors} < B] \approx e^{-\gamma}/\ln(B)$.
We can assume that
$\Pr_{p'}[p' \text{ has no small prime factors} < B] \ge 1/20$. Therefore, to
generate $p$ and $q$ such that neither $p'$ nor $q'$ has small prime factors
and such that $\gcd(p-1, q-1) = 2$, we have to run on average
$2 \times (20 + 20) = 80$ times the first phase of Boneh-Franklin's protocol.
This factor is not critical as this algorithm is run only once.

In the proof of correctness, if we want to have a security parameter of
$2^{80}$, we choose $B' = 2^{16} < B$. Hence, we have to choose $h = 5$
rounds. To generate the group of squares with probability greater than
$1 - 2^{-80}$, we need $k = 6$ verification keys. Therefore, we need 30 proofs
of correctness, but this is acceptable in the applications that we have in
mind.

8 Conclusion

In this paper, we have shown how to avoid safe-prime RSA moduli in Shoup's
proof of robustness such that the proof remains correct. We consider
environments where high security is required, such as electronic voting
schemes, and therefore we need protocols using standard assumptions and we are
ready to pay the price for it.

Basically, we use three different techniques, which allow us to prove that:

- the group of squares is cyclic,

- we generate $p$ and $q$ such that $p'$ and $q'$ do not contain small prime
factors, which allows us to generate the group $Q_N$ and make Shamir's secret
sharing scheme secure,

- we generate a set of generators of $Q_N$ by picking at random different
elements in $Q_N$.

Finally, we show how to adapt Shoup's proof in order to work with different
elements that generate $Q_N$ instead of a single one.

9 Appendix

9.1 Robustness of the Distributed Sieving Protocol 

To resist active (malicious) players, we first run a “sum-to-poly” algorithm
as described in [16,18]. When the polynomial sharing of $p$ is obtained, one
can note that $\sum_{j \in S \setminus \{0\}} \lambda^S_{0,j} = 1$, where
$\lambda^S_{0,j}$ denotes the Lagrange coefficient of the $j$th server.
Therefore, server $i$ can set its new polynomial share to $f(i) - 1$ if $f(i)$
denotes its polynomial share. Indeed, if
$p = f(0) = \sum_{j \in S \setminus \{0\}} \lambda^S_{0,j} f(j)$, then

\[ p - 1 = f(0) - 1 = \sum_{j \in S \setminus \{0\}} \lambda^S_{0,j} f(j) - \sum_{j \in S \setminus \{0\}} \lambda^S_{0,j} = \sum_{j \in S \setminus \{0\}} \lambda^S_{0,j} (f(j) - 1) \]

Next, the GCD protocol can also be applied with a polynomial sharing of the
secret value $\varphi = p - 1$.
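The identity used here, that the Lagrange coefficients at 0 sum to 1, can be
checked on a small example. The sketch below works over the rationals with a
hypothetical degree-2 sharing of $p = 97$; the names and values are ours and
are only meant to illustrate the share adjustment.

```python
from fractions import Fraction

def lagrange_at_zero(indices):
    """Lagrange coefficients lambda_{0,j} for recovering f(0) from the f(j)."""
    coeffs = {}
    for j in indices:
        lam = Fraction(1)
        for m in indices:
            if m != j:
                lam *= Fraction(-m, j - m)
        coeffs[j] = lam
    return coeffs

f = lambda x: 97 + 4 * x + 9 * x * x   # hypothetical sharing polynomial, f(0) = 97
servers = [1, 2, 3]
lam = lagrange_at_zero(servers)

p_rec = sum(lam[j] * f(j) for j in servers)             # recovers p
p_minus_1 = sum(lam[j] * (f(j) - 1) for j in servers)   # shares f(j) - 1 give p - 1
```

Because the coefficients sum to 1, subtracting 1 from every share subtracts
exactly 1 from the shared secret, with no interaction between the servers.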

9.2 Proof of Lemma 1 

Proof. We can note that
$\varphi(N) = N - p - q + 1 = (N - 1) - (p - 1) - (q - 1)$. So,

\[ (N - 1) - \varphi(N) = (p - 1) + (q - 1) \]

Consequently,
$\gcd(N-1, \varphi(N)) = \gcd((N-1) - \varphi(N), \varphi(N)) = \gcd((p-1) + (q-1), \varphi(N))$.
If we note $a = p - 1$ and $b = q - 1$, we have to compare $\gcd(a+b, ab)$ and
$\gcd(a, b)$. It is easy to see that $\gcd(a, b) \mid \gcd(a+b, ab)$, because
if $f \mid \gcd(a, b)$, then $f \mid a$ and $f \mid b$, so $f \mid a + b$ and
$f \mid ab$.

But let $f \mid \gcd(a+b, ab)$. As

\[ \gcd(a+b, ab) = \gcd(a+b, ab - a(a+b)) = \gcd(a+b, -a^2) = \gcd(a+b, a^2) \]

we can assume that $f \mid (a+b)$ and $f \mid a^2$. If $f$ is a prime number,
$f \mid a$ and, as $f \mid (a+b)$, $f \mid \gcd(a, b)$. If $f$ is not a prime
number but a power of some prime number, say $f = r^{\alpha}$ with
$\alpha = 2\beta$, we have $r^{2\beta} \mid a^2$ and hence $r^{\beta} \mid a$.
Hence, $r^{\beta} \mid a + b$ and $r^{\beta} \mid a$, so
$r^{\beta} \mid \gcd(a, b)$. □

9.3 Proof of Theorem 6

Theorem 6. With probability greater than
$1 - 2 \times \frac{k+B-1}{k-1} \times \frac{1}{B^k}$, a random $k$-tuple
$(v_1, \dots, v_k)$ generates $Q_N$.

Let us first define additional notations. If $(v_1, \dots, v_k)$ is a
$k$-tuple of $(Q_N)^k$, we use $\langle v_1, \dots, v_k \rangle$ to denote the
subgroup of $Q_N$ that is generated by the $v_i$'s, i.e.,
$\langle v_1, \dots, v_k \rangle = \{x \in Q_N \mid \exists (\lambda_1, \dots, \lambda_k),\ x = \prod_{i=1}^{k} v_i^{\lambda_i} \bmod N\}$.

We also denote by $\zeta(k)$ the Riemann zeta function, defined by
$\zeta(k) = \sum_{d=1}^{\infty} 1/d^k$ for any integer $k \ge 2$. If
$n = q_1^{e_1} \times q_2^{e_2} \times \dots \times q_j^{e_j}$, we denote by
$\varphi_k(n) = n^k \times (1 - \frac{1}{q_1^k})(1 - \frac{1}{q_2^k}) \dots (1 - \frac{1}{q_j^k})$
the generalization of the Euler function in the case of $k$ generators.
Finally, if $n$ has no prime factors less than $B$, we define $\zeta_B(k)$ as
$\sum_{d=B}^{\infty} 1/d^k$.

To find a generator $v$ of $Q_N$, we have to find a $v$ such that $v \bmod p$
generates $Q_p$ and $v \bmod q$ generates $Q_q$.

We estimate the probability that $x \in Q_N$ is a generator of $Q_N$. The
probability of catching such a number depends on the factorization of the
orders $p'$ of $Q_p$ and $q'$ of $Q_q$. Yet, even if $M = p'q'$ has no small
factors, the probability of obtaining such a generator is not overwhelming.
Indeed, if we pick a random element $v$ in $Q_p$, the probability that $v$ is
a generator of $Q_p$ is

\[ \Pr = \Pr_{v \in Q_p}(\langle v \rangle = Q_p) = \frac{\varphi(p')}{p'} = \prod_{p_i \mid p'} \left(1 - \frac{1}{p_i}\right) \]

and if $p_1 < 2B$, we can bound the probability by $\Pr < 1 - \frac{1}{2B}$.
The probability that $B < p_1 < 2B$ is equal to the probability that $p'$ is
divisible by at least one prime that belongs to $[B, 2B]$. So,
$\Pr[B < p_1 < 2B] = \sum_{B < q_i < 2B,\ q_i \text{ prime}} \frac{1}{q_i} \ge \frac{1}{2B} \times (\pi(2B) - \pi(B))$
if we denote by $\pi(x)$ the number of primes between 2 and $x$. If
$B = 2^{16}$, with probability $> 1/26$, $\Pr < 1 - \frac{1}{2B}$.
Consequently, we cannot say that this probability is overwhelming.

However, if we allow several random elements to be chosen in $Q_N$, then the
subgroup $\langle v_1, \dots, v_k \rangle$ is equal to $Q_N$ with high
probability. A $k$-tuple $(v_1, \dots, v_k)$ is a set of generators of $Q_N$
if $(v_1 \bmod p, \dots, v_k \bmod p)$ is a set of generators of $Q_p$ and if
$(v_1 \bmod q, \dots, v_k \bmod q)$ is a set of generators of $Q_q$. Hence,
the number of $k$-tuples of $(Q_N)^k$ that generate $Q_N$ is the number of
these $k$-tuples which, viewed as elements of $(Q_p)^k$, generate $Q_p$ and,
viewed as elements of $(Q_q)^k$, generate $Q_q$.

There are $p' = \frac{p-1}{2}$ elements in $Q_p$. This subgroup of $Z_p^*$ is
cyclic (since it is a subgroup of a cyclic group), and it has $\varphi(p')$
generators.

The analysis made by Poupard and Stern in [31] can be extended in our 
context as it is true in general cyclic groups and not only in Z^e*. Let us now 
present a preliminary lemma. 

Lemma 3. The number of $k$-tuples of $(Q_p)^k$ that generate $Q_p$ is
$\varphi_k(p')$.

Proof. Let $(v_1, \dots, v_k)$ be a $k$-tuple of $(Q_p)^k$ and $v$ be a
generator of $Q_p$; for $i = 1, \dots, k$, we define $a_i \in Z_{p'}$ by the
relation $v^{a_i} = v_i \bmod p$. We first notice that $(v_1, \dots, v_k)$
generates $Q_p$ if and only if the ideal generated by $a_1, \dots, a_k$ in the
ring $Z_{p'}$ is the entire ring. Bezout equality shows that this event occurs
iff $\gcd(a_1, \dots, a_k, p') = 1$.

Now, we count the $k$-tuples $(a_1, \dots, a_k) \in (Z_{p'})^k$ such that
$\gcd(a_1, \dots, a_k, p') = 1$. Let $\prod_{i=1}^{t'} q_i^{f_i}$ be the prime
factorization of $p'$. Then, $\gcd(x, p') = 1$ $\Leftrightarrow$
$\forall i \le t'$, $\gcd(x, q_i^{f_i}) = 1$ $\Leftrightarrow$
$\forall i \le t'$, $\gcd(x \bmod q_i^{f_i}, q_i^{f_i}) = 1$.

Using the Chinese remainder theorem, the problem reduces to counting the
number of $k$-tuples $(\beta_1, \dots, \beta_k)$ of $(Z_{q_i^{f_i}})^k$ such
that $\gcd(\beta_1 \bmod q_i^{f_i}, \dots, \beta_k \bmod q_i^{f_i}, q_i) = 1$
for $i = 1, \dots, t'$. The $k$-tuples that do not verify this relation for a
fixed index $i$ are of the form $(q_i\gamma_1, \dots, q_i\gamma_k)$ where
$(\gamma_1, \dots, \gamma_k) \in (Z_{q_i^{f_i - 1}})^k$, and there are exactly
$q_i^{k(f_i - 1)}$ such $k$-tuples.

Finally, the number of $k$-tuples of $(Z_{p'})^k$ such that
$\gcd(a_1, \dots, a_k, p') = 1$ is
$\prod_{i=1}^{t'} (q_i^{k f_i} - q_i^{k(f_i - 1)}) = \prod_{i=1}^{t'} \varphi_k(q_i^{f_i}) = \varphi_k(p')$
since $\varphi_k$ is multiplicative. □
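The counting formula of Lemma 3 can be confirmed by brute force on small
moduli. In the sketch below (function names ours), `phi_k` implements
$\varphi_k(n) = n^k \prod_{q \mid n} (1 - 1/q^k)$ and `count_generating_tuples`
enumerates all $k$-tuples of $Z_n$ whose entries, together with $n$, have
gcd 1.

```python
from functools import reduce
from itertools import product
from math import gcd

def prime_factors(n):
    """Set of distinct primes dividing n."""
    factors, q = set(), 2
    while n > 1:
        if n % q == 0:
            factors.add(q)
            while n % q == 0:
                n //= q
        q += 1
    return factors

def phi_k(n, k):
    """Generalized Euler function: n^k * prod over primes q | n of (1 - 1/q^k)."""
    result = n ** k
    for q in prime_factors(n):
        result = result // q ** k * (q ** k - 1)
    return result

def count_generating_tuples(n, k):
    """Number of (a_1..a_k) in (Z_n)^k with gcd(a_1, ..., a_k, n) == 1."""
    return sum(
        1 for tup in product(range(n), repeat=k) if reduce(gcd, tup, n) == 1
    )

checks = {(n, k): (count_generating_tuples(n, k), phi_k(n, k))
          for n, k in [(15, 2), (12, 2), (9, 3), (7, 1)]}
```

For $k = 1$ the function reduces to the usual Euler function, and the ratio
$\varphi_k(n)/n^k$ approaches 1 quickly as $k$ grows, which is the effect
exploited in the rest of the proof.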

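Now, we return to the proof of theorem 6. Let us first introduce a notation:
for any integer $x$, let $S_x$ be the set of the indices $i$ such that $p_i$
is a factor of $x$. From the previous lemma, we know that the probability for
a $k$-tuple of $(Q_p)^k$ to generate $Q_p$ is
$pr = \frac{\varphi_k(p')}{p'^k}$. Lemma 3 shows that $pr$ is equal to the
product $\prod_{i \in S_{p'}} (1 - \frac{1}{p_i^k})$. The inverse of each term
$1 - \frac{1}{p_i^k}$ can be expanded in power series:
$(1 - \frac{1}{p_i^k})^{-1} = \sum_{\alpha=0}^{\infty} \frac{1}{p_i^{\alpha k}}$.
The quantity $pr^{-1}$ is a product of series with positive terms,
$pr^{-1} = \prod_{i \in S_{p'}} \left( \sum_{\alpha=0}^{\infty} \frac{1}{p_i^{\alpha k}} \right)$,
so we can distribute terms and obtain that $pr^{-1}$ is the sum of $1/d^k$
where $d$ ranges over integers whose prime factors are among the $p_i$'s,
$i \in S_{p'}$. This sum is smaller than the unrestricted sum
$\sum_{d=1}^{\infty} 1/d^k = \zeta(k)$. Finally, we obtain
$pr > 1/\zeta(k)$.

In our case, neither $p'$ nor $q'$ has prime factors less than $B$, therefore
$pr^{-1} \le 1 + \zeta_B(k)$, with

\[ 1 + \zeta_B(k) = 1 + \sum_{d=B}^{\infty} \frac{1}{d^k} < 1 + \frac{1}{B^k} + \int_B^{\infty} \frac{dx}{x^k} = 1 + \frac{k+B-1}{k-1} \times \frac{1}{B^k} \]

As, for all $x > -1$, $1/(1+x) \ge 1 - x$, we get
$1/(1 + \zeta_B(k)) \ge 1 - \frac{k+B-1}{k-1} \times \frac{1}{B^k}$.

Therefore, the number of $k$-tuples of $(Q_p)^k$ that generate $Q_p$ is
$\varphi_k(p')$ and

\[ \Pr[\langle v_1, \dots, v_k \rangle = Q_p] = \frac{\varphi_k(p')}{p'^k} \ge \frac{1}{1 + \zeta_B(k)} > 1 - \frac{k+B-1}{k-1} \times \frac{1}{B^k} \]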



Consequently, with probability greater than
$1 - 2 \times \frac{k+B-1}{k-1} \times \frac{1}{B^k}$, the $k$-tuple
$(v_1, \dots, v_k)$ generates $Q_p$ and $Q_q$ and therefore $Q_N$. For
example, with $k = 6$ and $B = 2^{16}$, this probability is larger than
$1 - 1/2^{80}$.

Abstract. Threshold cryptosystems and signature schemes give ways
to distribute trust throughout a group and increase the availability of
cryptographic systems. A standard approach in designing these protocols
is to base them upon existing single-server systems having the desired
properties.

Two recent (single-server) signature schemes, one due to Gennaro et
al., the other to Cramer and Shoup, have been developed which are provably
secure using only standard number-theoretic hardness assumptions.
Catalano et al. proposed a statically secure threshold implementation 
of these schemes. We improve their protocol to make it secure against 
an adaptive adversary, thus providing a threshold signature scheme with 
stronger security properties than any previously known. 

As a tool, we also develop an adaptively secure, erasure-free threshold 
version of the Paillier cryptosystem. 



1 Introduction 

The goal of threshold cryptography [14,15] is to enable a cluster of cooperating 
servers to securely and efficiently implement such cryptographic tasks as sign- 
ing and decrypting. A threshold cryptographic system should remain functional 
and secure even when a fraction (say, almost one half) of the servers become 
malicious. This problem is well-motivated in practice. 

One of the strongest adversarial models in this setting is the so-called adap- 
tive erasure-free model. In this setting (1) the adversary corrupts servers over 
time depending on its entire view of the computation; and (2) upon becoming 
corrupted, the players have to hand over to the adversary their entire computa- 
tion history; i.e., nothing can be erased. 

Although results in general multi-party computation guarantee feasibility [6, 
5,10], they cannot be directly applied without incurring a considerable computa- 
tion penalty. In contrast, threshold protocols are tailor-made for a specific task 
at hand and are therefore much more practical. 

Securing threshold cryptographic systems against adaptive attacks has been
the subject of extensive recent research [7,17,21]. Erasure-free solutions
have also been considered [21]. However, none of these papers considered the
question of constructing adaptively secure threshold versions of signature
schemes provably secure against adaptive chosen message attacks [18,12].

On the other hand, statically secure threshold versions of the Gennaro et
al. [18] and Cramer-Shoup [12] signature schemes have been proposed by
Catalano et al. [8]. Unlike the adaptive adversary, the static adversary's corruption
strategy is independent of the computation history and can be assumed to be 
fixed in advance. It is known that statically secure protocols are not necessarily 
adaptively secure [6,5,10]. While Catalano et al. suggest that it is possible to 
turn their statically-secure solution into an adaptively secure one, they do not 
give an explicit construction. 

In this paper, we extend the protocol of Catalano et al. and obtain the first 
construction of erasure-free adaptively secure threshold version of the Cramer- 
Shoup signature scheme. Our results apply as well to the Gennaro et al. [18] 
signature scheme. Practical threshold signature schemes with this level of secu- 
rity have not been exhibited before. 

The general structure of our results. The protocol of Catalano et al. is con- 
structed as follows: first, they give a secure protocol for the honest-but-curious 
adversary, i.e. the adversary whose corruption strategy may be adaptive, but 
who cannot force the players he controls to deviate from their respective pro- 
tocols. Then, they show how to secure it against an adversary that forces the 
players to deviate from their protocols arbitrarily. In this second step, they are 
only able to exhibit static security. 

Our starting point is the honest-but-curious protocol of Catalano et al. In 
order to make it adaptively secure in the erasure-free model, we utilize the tech- 
niques due to Cramer et al. [11]. 

Cramer et al. show how, given a threshold cryptosystem with certain desir- 
able properties, to securely (in the static model) compute any arithmetic circuit. 
The general structure of their construction is as follows: the inputs to the circuit 
are given in encrypted form. The circuit is evaluated gate-by-gate. To evaluate 
each gate, the players run a corresponding protocol. For example, for the multi- 
plication gate, the players run a special protocol that, on input two ciphertexts 
$E(a)$ and $E(b)$, produces a ciphertext $E(ab)$. Once all the gates are evaluated,
the players jointly decrypt the ciphertext that corresponds to the output of the 
last gate. Cramer et al. [11] provide a statically secure multiplication protocol 
and a statically secure threshold cryptosystem with the required properties. 

We extend the results of Cramer et al. in two ways: (1) we observe that their 
multiplication protocol is adaptively secure under a weaker definition that al- 
lows probability of failure, provided that the underlying threshold cryptosystem 
is adaptively secure; and (2) we exhibit an adaptively secure threshold cryptosys- 
tem with the required properties, namely, we show how to secure the Fouque et 
al. [16] version of the Paillier cryptosystem [24] against the adaptive adversary. 

We then plug the resulting adaptively secure multiplication protocol into a 
slight modification of the honest-but-curious protocol of Catalano et al. 

This approach has more general applications. It is intuitively simpler to
construct protocols secure against the honest-but-curious adversary than ones
secure against the active adversary. In fact the first results in secure
multi-party computation were of this flavor [19]. Guided by our example, one
can convert a protocol for the honest-but-curious case into one that is secure
against an active and adaptive adversary in the erasure-free model, at only a
small cost in efficiency.



2 The Adaptive Adversary Model 

In this paper, we use a standard model [7] to describe the execution of
protocols and the capabilities of the adversary. We assume the existence of
$\ell$ parties communicating over a synchronous broadcast channel with
rushing, where up to a threshold $t < \ell/2$ of them may be corrupted. The
value $k$ will represent the security parameter.

A t-limited adaptive adversary may choose to corrupt any party at any point 
over the lifetime of the system, as long as it does not corrupt more than t parties 
in total. The choices may be based on everything the adversary has seen up to 
that point (all broadcast messages and the computation histories of all other 
corrupted parties). When an adaptive adversary corrupts a party, it is given 
the entire computation history of that party and takes control of its actions for 
the life of the system. Note that this prohibits the honest parties from making 
erasures of their internal states at any time. 

Summary of definitions and techniques. As expected, security of a protocol 
is defined in the adaptive model using the simulation paradigm. For any adaptive 
adversary A, there must exist a simulator S which interacts with A to provide a 
view which is computationally indistinguishable from the adversary’s view of the 
real protocol. The main difficulty in designing secure protocols in the adaptive 
model is in being able to fake the messages of the honest parties such that there 
are consistent internal states that can be supplied to the adversary when it 
chooses to corrupt new parties. In fact, we will design the protocols such that 
the simulator can supply consistent states on behalf of all honest parties except 
one, which we call the “single inconsistent party” [7] and denote $P_S$. We stress
that the inconsistent party is chosen at random by the global simulator, and 
remains the same throughout all simulator subroutines. 

We will design simulators that supply a suitable view to the adversary
provided $P_S$ is not corrupted, or said another way, the adversary's view
will be indistinguishable from a real invocation up to the point at which
$P_S$ is corrupted (if ever). Of course, if $P_S$ does become corrupted, then
we may assume that the adversary detects the simulation perfectly; we call the
probability of this event the error of the protocol. In order to make a
reduction from a single-server signature scheme to a threshold version, we
will require that the error be non-negligibly smaller than one. In particular,
because $2t < \ell$, the error of our protocols will be less than 1/2.

In all of our protocols, any deviation that is detectable by all honest
parties will cause the misbehaving party to be excluded from all protocols for
the life of the system. Upon detecting a dishonest party, the others restart
only the current protocol from the beginning. Intuitively, this strategy
prevents an adversary from gaining some advantage by failing to open its
commitments after witnessing the honest parties' behavior. This rule will
apply in each round of every protocol, even when not stated explicitly. In our
simulators, we will be explicit about when misbehaving parties cause the
protocol to be restarted, and when they cause the adversary to be rewound.

A note about the round-efficiency of this rule: the number of rounds of a single 
protocol execution is bounded only by a constant multiple of the threshold t 
(since one corrupt party may force a restart every time). However, the adversary 
can force a total of only $O(t)$ extra rounds to be executed over all invocations,
which is a negligible amortized cost over the life of the system. (This assumes 
that all protocols are constant-round when no malicious parties are present, 
which will be the case.) 



3 Tools 

In this section we address a special kind of zero-knowledge proof called a
$\Sigma$-protocol [11]. First we summarize $\Sigma$-protocols in the two-party
setting, then we demonstrate how to implement them in a multiparty setting
using trapdoor commitments. A reader familiar with the work of Cramer et
al. [11] can skip to section 4.

Two-party $\Sigma$-protocols. The two-party $\Sigma$-protocols we use here
are, in summary, honest-verifier perfect zero-knowledge proofs of knowledge
with perfect completeness in which the knowledge extractor needs only two
different conversations in order to extract a witness. We refer the reader to
Cramer et al. [11] for formal definitions.

Trapdoor commitments. A trapdoor commitment scheme is much like a reg- 
ular commitment scheme: a party P commits to a value by running some proba- 
bilistic algorithm on the value. The commitment gives no information about the 
committed value. At some later stage, P opens the commitment by revealing 
the committed value and the random coins used by the commitment algorithm. 
P must not be able to find a different value (and corresponding random string) 
that would yield the same commitment. 

Trapdoor commitment schemes have one additional property: there exists 
a trapdoor value which allows P to construct commitments that he can open 
arbitrarily, such that this cheating is not detectable. Cramer et al. [11] provide 
a formal definition. 

Multiparty Σ-protocols. The goal of a multiparty Σ-protocol is for many
parties to make claims of knowledge such that all parties will be convinced.
If all players are honest-but-curious, a naive way of achieving this goal is to
make each prover participate in a separate (two-party) Σ-protocol with each
of the other players. However, this approach incurs significant communication
overhead, and it is not secure against an active adversary, since Σ-protocols
are only honest-verifier zero-knowledge.

Part of an efficient multiparty Σ-protocol involves choosing a shared $k$-bit
challenge string, though no particular distribution is required.^1 We simply
need two different invocations to generate different challenge strings, except
with negligible probability. The challenges are generated as follows: in a
preprocessing phase, generate a key for a collision-resistant hash function
from $\{0,1\}^l$ to $\{0,1\}^k$. To generate a challenge, each party
contributes one random bit, then the hash function is applied to the
concatenation of these bits. If two identical challenges are created, then
either the inputs to the hash function were identical (which happens with
probability at most $1/2^{l/2}$, since at least half the parties are honest),
or a collision in the hash function is found.
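
The challenge generation just described can be sketched as follows. This is a
minimal illustration, not the construction from the paper: keyed BLAKE2b from
Python's standard library stands in for the keyed collision-resistant hash
function, and the names `sample_hash_key`, `contribute_bit` and
`derive_challenge` are hypothetical.

```python
import hashlib
import secrets


def sample_hash_key(num_bytes: int = 16) -> bytes:
    """Preprocessing phase: sample a key for the (stand-in) hash function."""
    return secrets.token_bytes(num_bytes)


def contribute_bit() -> int:
    """Each party contributes a single random bit."""
    return secrets.randbits(1)


def derive_challenge(key: bytes, bits: list[int], k: int) -> int:
    """Hash the concatenation of the parties' bits down to a k-bit challenge.

    Assumption: keyed BLAKE2b models the collision-resistant hash; k <= 256.
    """
    if not 0 < k <= 256:
        raise ValueError("k must be between 1 and 256")
    concatenated = "".join(str(b) for b in bits).encode("ascii")
    digest = hashlib.blake2b(concatenated, key=key, digest_size=32).digest()
    # Keep the top k bits of the 256-bit digest.
    return int.from_bytes(digest, "big") >> (256 - k)


if __name__ == "__main__":
    key = sample_hash_key()
    bits = [contribute_bit() for _ in range(10)]   # one bit per party
    print(derive_challenge(key, bits, k=80))
```

Two invocations collide only if the honest parties happen to contribute the
same bits twice or the hash function itself collides, which mirrors the
argument in the text.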

The complete description of a multiparty Σ-protocol is as follows: in a
preprocessing phase, a public key $k_i$ for a trapdoor commitment scheme is
generated for each $P_i$, and is distributed to all the parties by a
key-distribution protocol which hides the trapdoor values. In a single proof
phase, some subset $P'$ of parties contains the parties who are to prove
knowledge.

1. Each $P_i \in P'$ computes $a_i$, the first message of the two-party
Σ-protocol. It then broadcasts a commitment $c_i = C(a_i, r_i, k_i)$, where
$r_i$ is chosen randomly by $P_i$.

2. A challenge $r$ is generated by the parties, as described above. This single
challenge will be used by all the provers.

3. Each $P_i \in P'$ computes the answer $z_i$ to the challenge $r$, and
broadcasts $a_i, r_i, z_i$.

4. Every party can check every proof by verifying that $c_i = C(a_i, r_i, k_i)$
and that $(a_i, r, z_i)$ is an accepting conversation in the two-party
Σ-protocol.
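
The four steps above can be illustrated with a toy instantiation. The sketch
below is an assumption-laden example rather than the paper's construction: it
uses Schnorr's protocol as the two-party Σ-protocol, a Pedersen-style
commitment as the trapdoor commitment, a tiny group ($P = 2039$, $Q = 1019$),
and a locally sampled challenge in place of the jointly generated one. All
function and class names are hypothetical.

```python
import hashlib
import secrets

# Toy parameters (assumed for illustration): P = 2Q + 1 with P, Q prime,
# and G = 4 generates the subgroup of order Q in Z_P^*.
P, Q, G = 2039, 1019, 4


def encode(element: int) -> int:
    """Hash a group element into Z_Q so that it can be committed to."""
    digest = hashlib.sha256(str(element).encode("ascii")).digest()
    return int.from_bytes(digest, "big") % Q


def commit(a: int, r: int, k: int) -> int:
    """Pedersen-style commitment C(a, r, k) = G^encode(a) * k^r mod P."""
    return (pow(G, encode(a), P) * pow(k, r, P)) % P


class Prover:
    def __init__(self) -> None:
        self.x = secrets.randbelow(Q)              # witness
        self.y = pow(G, self.x, P)                 # statement: knowledge of log_G(y)
        trapdoor = 1 + secrets.randbelow(Q - 1)    # hidden from all parties in the real protocol
        self.k = pow(G, trapdoor, P)               # public commitment key k_i

    def step1(self) -> int:
        """Compute the first message a_i and broadcast only a commitment to it."""
        self.nonce = secrets.randbelow(Q)
        self.a = pow(G, self.nonce, P)
        self.r = secrets.randbelow(Q)              # commitment coins r_i
        return commit(self.a, self.r, self.k)

    def step3(self, challenge: int):
        """Answer the shared challenge and open the commitment."""
        z = (self.nonce + challenge * self.x) % Q
        return self.a, self.r, z


def check(y: int, k: int, c: int, challenge: int, a: int, r: int, z: int) -> bool:
    """Step 4: the commitment opens correctly and (a, challenge, z) is accepting."""
    opens = c == commit(a, r, k)
    accepts = pow(G, z, P) == (a * pow(y, challenge, P)) % P
    return opens and accepts


def run_proof_phase(provers) -> bool:
    commitments = [p.step1() for p in provers]           # step 1
    challenge = secrets.randbelow(Q)                     # step 2 (stand-in)
    openings = [p.step3(challenge) for p in provers]     # step 3
    return all(check(p.y, p.k, c, challenge, *o)         # step 4
               for p, c, o in zip(provers, commitments, openings))


if __name__ == "__main__":
    print(run_proof_phase([Prover() for _ in range(4)]))
```

The group here is far too small to be binding or sound in any real sense; the
point is only the message flow: commit to the first message, receive one shared
challenge, then open and answer.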

Cramer et al. [11] prove the security of this protocol against a static
adversary. We have shown that it is also secure in the adaptive setting, using
the single inconsistent party technique. We refer the reader to the full
version [22] of this paper for the proof.



4 Threshold Signatures Using a Threshold Cryptosystem 

Suppose an adaptive-chosen-message-secure signature scheme (such as
Cramer-Shoup [12]) and a semantically secure cryptosystem (such as
Paillier [24]) are given. Our signature scheme will be constructed as follows:
besides the key pair $(PK, SK)$ for a secure signature algorithm, the key
generation algorithm also generates the key pair $(E, D)$ for some semantically
secure cryptosystem. The public key for the resulting signature scheme will be
$(PK, E, E(SK))$, while the secret key is simply the secret key of the
underlying signature scheme, i.e., $SK$. The signature and verification
algorithms are the same as in the signature scheme given. It is easy to see, by
a hybrid argument, that this resulting signature scheme is secure against the
adaptive chosen message attack.

In the following sections, we will describe secure protocols for key generation 
and signing, and give proofs of security for these protocols. 

^1 Thanks to an anonymous referee for suggesting this improvement, due to Nielsen [23].

4.1 Key Generation

Recall the Cramer-Shoup signature scheme. The public key of the signer is a
tuple $(N, h, x, e', H)$, where: $N = pq$ is an RSA modulus such that
$p = 2p' + 1$, $q = 2q' + 1$, and $p', q'$ are both primes ($p$ and $q$ are
called safe primes); the values $h, x \in Z_N^*$ are both quadratic residues
modulo $N$; $e'$ is a random prime number; $H$ is a collision-resistant hash
function. The signature on a message $m$ is a tuple $(e, y', y)$ such that:
$e \ne e'$ is a random prime number; $y'$ is a random quadratic residue modulo
$N$; and $y^e = x h^{H(x')} \bmod N$, where $x' = (y')^{e'} h^{-H(m)} \bmod N$.

The key generation algorithm for the Cramer-Shoup signature scheme generates
a public key $PK = (N, h, x, e', H)$ and a secret key $SK = \phi(N)$. Our
public key will also include a Paillier public key $(g, n)$ (where $n$ is a
product of two safe primes, and $g$ has order $n$ modulo $n^2$), and a
ciphertext $E(\phi(N)) = g^{\phi(N)} r^n \bmod n^2$ for a random
$1 < r < n^2$.

Our key generation protocol will not be efficient, but since it is only carried
out once, this efficiency penalty can be ignored. It will proceed in two steps.
In Step 1, the parties will run a general multi-party computation to generate
the public key $(PK, E, E(SK))$, as follows: each party $P_i$ will contribute a
random string $x_i$. The resulting key will be computed using the circuit for
single-server key generation with randomness obtained by the exclusive-or of
the $x_i$'s: $R = \bigoplus_i x_i$. The inputs to Step 2 are the values
$\{x_i\}$ (i.e., the coins from Step 1) and fresh random bits $\{r_i\}$. Then,
using general MPC, the parties will compute the auxiliary information,
emulating the one-server algorithm provided in section 5.2. For the underlying
general MPC we can use the protocol due to Cramer et al. [10], which is
adaptively secure and tolerates any number of corruptions below one half of the
servers. In order to implement secure channels required by Cramer et al. [10],
we use the non-committing encryption technique due to Canetti et al. [6] and
Damgård and Nielsen [13].

The protocol described above will be secure: suppose we are given a target
public key $(PK, E, E(SK))$. Our goal is to construct a simulator $S$ which, on
input the identity of an inconsistent party $P_S$, simulates the adversary's
view of the computation provided the adversary does not corrupt $P_S$. We can
use the simulator $S_{MPC}$ as a subroutine. For Step 1, $S$ will give
$S_{MPC}$ the value $(PK, E, E(SK))$ as the target output. We will also supply
it with some random coins for parties that are corrupted at the beginning. As
more parties get corrupted over time, $S_{MPC}$ will request that we provide it
with their inputs. $S$ will just provide some more random coins each time that
happens. If the adversary ever tries to corrupt the inconsistent party $P_S$,
$S$ aborts. Since the actual randomness of the algorithm is the exclusive-or of
the coins of all parties, the resulting view will be correct. For Step 2, we
will first run the one-server algorithm for generating the simulated auxiliary
information and the simulated secret information for all but one party $P_S$.
This one-server algorithm is described in section 5.2. We will then run the
simulator $S_{MPC}$ in the same way as for Step 1.

4.2 Computing a Signature

Signature generation is done in three steps: (1) generation of $(y', e)$;
(2) generation of a verifiable additive sharing of $e^{-1} \bmod \phi(N)$; and
(3) computation of $y$ such that $y^e = x h^{H(x')} \bmod N$, where
$x' = (y')^{e'} h^{-H(m)} \bmod N$, i.e. computation of
$(x h^{H(x')})^{e^{-1}} \bmod N$.

Adaptively secure erasure-free threshold protocols for selecting a random 
number already exist (see, for example, the one due to Jarecki and Lysyan- 
skaya [21]). These can be employed for performing Step 1. 

Suppose a secure protocol for computing an additive sharing of
$e^{-1} \bmod \phi(N)$ has been performed. Let $d_i$ denote the share held by
player $P_i$. Suppose it is backed up with a public ciphertext $E(d_i)$. Each
player $P_i$ computes $x' = (y')^{e'} h^{-H(m)} \bmod N$ and reveals
$(x h^{H(x')})^{d_i} \bmod N$, and proves that this was done correctly by
invoking the Σ-protocol for proving equality of discrete logarithms. (If a
player fails, his $d_i$ is decrypted so that the other players can compute
whatever is needed without him.) This takes care of Step 3.
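
The arithmetic behind Step 3 can be sketched in a few lines. This is only an
illustration under stated assumptions: the toy modulus $N = 23 \cdot 47$, the
exponent $e = 5$, the value `base` standing in for $x h^{H(x')} \bmod N$, and
the helper names are all hypothetical, and the sharing is dealt in the clear
with no proofs or backup encryptions.

```python
import secrets


def share_additively(d: int, modulus: int, parties: int) -> list[int]:
    """Split d into `parties` shares that sum to d modulo `modulus`."""
    shares = [secrets.randbelow(modulus) for _ in range(parties - 1)]
    shares.append((d - sum(shares)) % modulus)
    return shares


def combine_partial_powers(base: int, shares: list[int], N: int) -> int:
    """Multiply the published values base^{d_i} mod N to obtain base^d mod N."""
    y = 1
    for d_i in shares:
        y = (y * pow(base, d_i, N)) % N   # each player reveals base^{d_i}
    return y


# Toy safe primes: 23 = 2*11 + 1 and 47 = 2*23 + 1.
p, q = 23, 47
N, phi = p * q, (p - 1) * (q - 1)
e = 5                                     # prime, coprime to phi(N)
d = pow(e, -1, phi)                       # e^{-1} mod phi(N), Python 3.8+
base = 4                                  # stands in for x * h^{H(x')} mod N

shares = share_additively(d, phi, parties=5)
y = combine_partial_powers(base, shares, N)
print(pow(y, e, N) == base)
```

Because the shares sum to $e^{-1}$ modulo $\phi(N)$ and `base` is invertible
modulo $N$, the product of the partial powers is an $e$-th root of `base`.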

Therefore, the only challenging piece is the computation of a verifiable
additive sharing of $e^{-1}$, i.e., Step 2. Following the example of Catalano
et al., we cast this as the modular inversion problem. The problem is as
follows: suppose $E(\phi)$ is public. On input $e$, the task is to compute an
additive sharing of the value $d = e^{-1} \bmod \phi$, with public backup
encryptions of each share. For the problem at hand, the value $\phi$ is, of
course, $\phi(N)$.

Simulating signature generation given a signature. Suppose the simulators for
each of the specific steps are given (simulators for Steps 1 and 3 are known;
the one for Step 2 is given below). Here is how we simulate the signature
generation. Our input is a signature on message $m$: $(e, y', y)$. First, we
run the simulator for Step 1 and simulate the distributed generation of
$(e, y')$. Then we run the simulator for Step 2 and arrive at a verifiable
additive sharing of $e^{-1}$. Finally, we run the simulator for Step 3 to
simulate raising the value $x h^{H(x')}$ to the power $e^{-1}$ to obtain $y$.

Background for the modular inversion protocol. Catalano et al. [8] present 
two versions of a modular inversion protocol which are secure against a static 
adversary. The first is private but not robust, while the second adds robustness at 
the cost of more complexity. Here we give an adaptively secure version, based on 
their simpler protocol. Our protocol requires $O(lk)$ bits to be broadcast,
which is the same cost as the protocol of Catalano et al.

We assume the existence of a homomorphic threshold cryptosystem, defined
in Appendix B. We denote an encryption of a message $x$ as $\overline{x}$ when
the public key is clear from the context. We also assume a trapdoor commitment
scheme as described in section 3.

Using an adaptively secure multiparty Σ-protocol, the Mult protocol from
Cramer et al. [11] is secure against an adaptive adversary as well, because its
simulator only uses a single inconsistent party.

A preliminary subprotocol. First we assume existence of a secure protocol
Mad (meaning “multiply, add, decrypt”) which has the following specification:
(1) public inputs $\overline{w}, \overline{x}, \overline{y}, \overline{z}$ to
all parties, (2) public output $F = wx + yz$ for all parties.

Given a suitable homomorphic threshold cryptosystem, Mad can be imple- 
mented using the secure Mult and Decrypt protocols. We give the protocol 
and a proof of security in Appendix A. 

Two preliminary Σ-protocols. In the inversion protocol, each party provides
a ciphertext and must prove that it is an encryption of zero. In addition, each 
party must publish a ciphertext and prove that the corresponding plaintext 
lies within a specified range. We describe both of these proofs for the Paillier 
cryptosystem in section 5.1. 

The inversion protocol. The Invert protocol has the following specification: 

— common public input $(pk, e, N, \overline{\phi}, \{k_i\})$. Here $pk$ is the
public key of the homomorphic cryptosystem, $e$ is a prime to be inverted
modulo the secret $\phi$, $N$ is an upper bound on the value of $\phi$, and
$\{k_i\}$ is the set of all public trapdoor commitment keys.

— secret input $sk_i$, the $i$th secret key share, to party $P_i$.

— common public output $\overline{d}_i$ where $d_i$ is described below.

— secret output $d_i$ from each party. The $\{d_i\}$ constitute an additive
sharing of the inverse, i.e. $\sum_{i \in P} d_i = e^{-1} \bmod \phi$.

The protocol proceeds as follows: 

1. Each $P_i$ publishes a random encryption $\overline{0}_i$ of zero, and
proves that it is valid (see section 4.2). All parties internally compute
$\overline{\phi}_B = (\boxplus_{i \in P} \overline{0}_i) \boxplus \overline{\phi}$.

2. Each $P_i$ chooses random $\lambda_i$ from the range $[0 \ldots N^2]$, and
random $r_i$ from the range $[0 \ldots N^3]$, and encrypts them to get
$\overline{\lambda}_i$ and $\overline{r}_i$, respectively.

3. Each $P_i$ broadcasts a commitment to his ciphertexts $\overline{\lambda}_i$
and $\overline{r}_i$.

4. Step R. Each $P_i$ decommits by broadcasting $\overline{\lambda}_i$ and
$\overline{r}_i$, and the random strings used to generate the commitments.

5. Each party proves that its $\lambda_i$ and $r_i$ values are within the
proper respective intervals: each party first publishes commitments to both
values, then proves that the committed values are the same as their respective
plaintexts, and finally proves that the committed values are within range.

6. Each party proves knowledge of its plaintexts $\lambda_i$ and $r_i$ using a
multiparty Σ-protocol. Let $\lambda = \sum_{i \in P} \lambda_i$,
$R = \sum_{i \in P} r_i$, and $F = Re + \lambda\phi$.

7. The parties run the Mad protocol on $\overline{e}$, $\overline{R}$,
$\overline{\lambda}$, and $\overline{\phi}_B$, where
$\overline{R} = \boxplus_{i \in P} \overline{r}_i$ and
$\overline{\lambda} = \boxplus_{i \in P} \overline{\lambda}_i$ by addition of
ciphertexts. This protocol securely computes the value $F = Re + \lambda\phi$
as the common output.

8. Each party determines whether $(e, F) = 1$. Because $e$ is prime,
$(e, F) \ne 1$ only if $e$ divides $\lambda$, which happens with probability
about $1/e$ because at least one $\lambda_i$ is chosen at random. If
$(e, F) \ne 1$, the parties repeat the protocol. Otherwise, all parties compute
$a, b$ such that

$$aF + be = 1$$
$$aRe + a\lambda\phi + be = 1$$
$$(aR + b) = e^{-1} \bmod \phi.$$

$P_i$'s share is $d_i = a r_i$ for $i > 1$, and $d_1 = a r_1 + b$ for $i = 1$.
Note that any party can use the homomorphic properties of the cryptosystem to
compute an encryption $\overline{d}_i$ for any $i \in P$, because the values of
$a$ and $b$ are known to all parties, as well as encryptions $\overline{r}_i$
for all $i \in P$.
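
The arithmetic core of the protocol (steps 2, 6 and 8) can be sketched without
any encryption, commitments or proofs. The example below is a simplified
illustration under that assumption: everything that the real protocol keeps
encrypted is computed in the clear by a single process, and the function names
and toy parameters are hypothetical.

```python
import secrets


def extended_gcd(x: int, y: int):
    """Return (g, a, b) with a*x + b*y == g == gcd(x, y)."""
    old_r, r = x, y
    old_a, a = 1, 0
    old_b, b = 0, 1
    while r:
        quotient = old_r // r
        old_r, r = r, old_r - quotient * r
        old_a, a = a, old_a - quotient * a
        old_b, b = b, old_b - quotient * b
    return old_r, old_a, old_b


def invert_in_the_clear(e: int, phi: int, N: int, parties: int) -> list[int]:
    """Return integer shares d_i whose sum is e^{-1} mod phi."""
    while True:
        lambdas = [secrets.randbelow(N ** 2 + 1) for _ in range(parties)]  # in [0, N^2]
        rs = [secrets.randbelow(N ** 3 + 1) for _ in range(parties)]       # in [0, N^3]
        R, lam = sum(rs), sum(lambdas)
        F = R * e + lam * phi            # the only value the real protocol reveals
        g, a, b = extended_gcd(F, e)     # a*F + b*e == gcd(F, e)
        if g != 1:
            continue                     # happens with probability about 1/e; retry
        shares = [a * r_i for r_i in rs]
        shares[0] += b                   # d_1 = a*r_1 + b, d_i = a*r_i otherwise
        return shares


if __name__ == "__main__":
    N, phi, e = 1081, 1012, 5            # toy values: N = 23*47, phi = phi(N)
    shares = invert_in_the_clear(e, phi, N, parties=4)
    print((sum(shares) * e) % phi)
```

Since $aF + be = 1$ and $F = Re + \lambda\phi$, reducing modulo $\phi$ gives
$(aR + b)e = 1$, so the shares sum to the inverse of $e$. Note that individual
shares may be negative integers.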



Theorem 1 (Security of inversion protocol). For $t < l/2$, Invert is an
adaptively $t$-secure protocol for computing an additive sharing of
$e^{-1} \bmod \phi$.

Proof: We will assume a secure key-generation protocol for the homomorphic 
cryptosystem. We describe the construction of such a protocol for a threshold 
Paillier cryptosystem in section 5. We will also assume a secure key-generation 
protocol for the trapdoor commitment scheme. 

Let $k_P$ be the public commitment keys for all the parties. Let $P_S$ be the
inconsistent party and $t_S$ be its trapdoor value determined by the simulator
for the key-generation protocol. Given $\mathcal{A}$, we will construct a
simulator subroutine $S_{Invert}$ which takes input
$(\mathcal{A}, pk, e, N, \overline{\phi}, k_P, P_S, t_S)$.
$S_{Invert}$ operates as follows:

1. For each honest $P_i$ except $P_S$, honestly publish and prove validity of a
random encryption of zero. For $P_S$, publish a blinding of
$\overline{N} \boxminus \overline{\phi}$ and run $S_\Sigma$ (see section 3)
with the trapdoor value $t_S$ to give a false proof of validity (do not extract
witnesses from the corrupt parties, however). If any parties gave invalid
proofs, restart the protocol and exclude them. At this stage, $\phi_B = N$, and
all of the parties hold an encryption of $N$ instead of an encryption of
$\phi$.

2. Through the decommitment phase, behave honestly. That is, choose random
$\lambda_i$ and $r_i$ for each honest party, commit to their ciphertexts, and
decommit honestly. If any parties fail to decommit, restart the protocol and
exclude them.

3. During the round in which the parties prove plaintext knowledge, use the
subroutine $S_{PoPK}$ to determine the values $\lambda_i$ and $r_i$ for all
corrupted parties $P_i$ who supplied valid proofs. If any parties fail to give
valid proofs, restart the protocol and exclude them.

4. Set $R' = \sum_{i \in P} r_i$ and $\lambda' = \sum_{i \in P} \lambda_i$. Run
$S_{Mad}$ on $\overline{e}$, $\boxplus_{i \in P} \overline{r}_i$,
$\boxplus_{i \in P} \overline{\lambda}_i$, $\overline{\phi}_B = \overline{N}$,
and $F' = R'e + \lambda' N$.

5. Proceed exactly according to the protocol, repeating if $(e, F') \ne 1$.

It is clear that the simulator runs in expected polynomial time. It remains to 
be shown that the output of the simulator is computationally indistinguishable 
from the output of a real run of the protocol. Let us assume for now that this is 
not the case, and that there is an adversary A which can distinguish between a 
real-life execution of Invert and an interaction with $S_{Invert}$ with non-negligible
advantage. We will provide a reduction that uses A to break the semantic security 
of the cryptosystem, thus establishing a contradiction. The reduction will employ 
a hybrid simulator interacting with A. 

Consider the simulator $S_{hybrid}$ which receives the public key $pk$ of the
homomorphic cryptosystem, the public commitment keys $k_P$, the identity of the
inconsistent party $P_S$, and its commitment trapdoor value $t_S$. In addition,
it is supplied with $N$, $e$, $\phi$, $\overline{\phi}$, a ciphertext
$\overline{b}$ where $b$ is either 0 or 1, and an auxiliary input representing
the state of the adversary $\mathcal{A}$. The adversary's interaction with
$S_{hybrid}$, and its resulting decision (whether the interaction was real or
simulated), will determine with non-negligible probability whether $b$ was 0
or 1. The hybrid simulator works as follows:

1. For each $P_i \ne P_S$, publish a random encryption of zero and prove its
validity. For $P_S$, publish $(N - \phi) \boxdot \overline{b}$ and give a false
proof of validity using $S_\Sigma$ and the trapdoor $t_S$. Let
$\overline{\phi}_B$ be as in the Invert protocol.

2. For all uncorrupted $P_i \ne P_S$, honestly choose $\lambda_i$ and $r_i$ and
commit to their ciphertexts. For $P_S$, choose $\lambda_S$ and $r_S$ from the
proper ranges, but use the commitment trapdoor $t_S$ to create cheating
commitments. Receive all commitments from the corrupt parties. Let the set of
corrupt parties at this point be called $C$. Let
$\lambda_H = \sum_{i \notin C} \lambda_i$ and $R_H = \sum_{i \notin C} r_i$.

3. For all honest $P_i \ne P_S$, decommit the ciphertexts honestly. For $P_S$,
open the cheating commitments as $\overline{\lambda}_S$ and $\overline{r}_S$.
If any parties fail to open their commitments, restart the protocol (excluding
those parties forever).

4. Honestly prove plaintext knowledge for all honest parties, and use
$E_{PoPK}$ to extract $\lambda_i$ and $r_i$ for all $P_i \in C$ who provided
valid proofs. If any parties fail to give valid proofs, restart the protocol
(excluding those parties forever). Let $\lambda_C = \sum_{i \in C} \lambda_i$
and $R_C = \sum_{i \in C} r_i$. Solve for $\lambda'_H$ and $R'_H$ such that
$F = (\lambda_C + \lambda_H)\phi + (R_C + R_H)e = (\lambda_C + \lambda'_H)N + (R_C + R'_H)e$.
We shall prove that such $\lambda'_H$ and $R'_H$ are easy to compute, and are
statistically indistinguishable from $\lambda_H$ and $R_H$ (respectively). Then
execute the following loop:

a) Rewind the adversary to Step R in the protocol. For all honest parties
$P_i \ne P_S$, again honestly decommit to their ciphertexts
$\overline{\lambda}_i, \overline{r}_i$. For $P_S$, open the cheating
commitments as blinded ciphertexts $\overline{\lambda}'_S$ and
$\overline{r}'_S$, where
$\overline{\lambda}'_S = \overline{\lambda}_S \boxplus ((\lambda'_H - \lambda_H) \boxdot \overline{b})$,
and $\overline{r}'_S = \overline{r}_S \boxplus ((R'_H - R_H) \boxdot \overline{b})$.
Receive decommitments from each corrupt party (which, if valid, must be the
same as the earlier valid decommitment). If any parties refuse to open their
commitments, go to the beginning of the loop.

b) Honestly prove plaintext knowledge on behalf of all honest parties
$P_i \ne P_S$, and use $S_{PoPK}$ and the commitment trapdoor $t_S$ to provide
fake proofs of plaintext knowledge for $\overline{\lambda}'_S$ and
$\overline{r}'_S$. Also receive proofs of plaintext knowledge from the corrupt
parties. If any parties give invalid proofs, go to the beginning of the loop.

c) Exit the loop.

5. Run $S_{Mad}$ on $\overline{e}$, $\overline{R}$ and $\overline{\lambda}$ (as
in the real protocol), $\overline{\phi}_B$, and the value $F$. Finish according
to the protocol.

First we must analyze the running time of the hybrid simulator. It suffices to
show that the loop is executed a polynomial number of times in expectation.
Note that the loop is only reached if all parties open their commitments and
prove plaintext knowledge correctly. Let $\epsilon_0$ be the probability that
the adversary behaves in this way, given that $P_S$ publishes random
encryptions $\overline{\lambda}_S, \overline{r}_S$, and let $\epsilon_1$ be
defined similarly, given that $P_S$ publishes random encryptions
$\overline{\lambda}'_S, \overline{r}'_S$. By semantic security, $\epsilon_0$ is
negligibly close to $\epsilon_1$. The contribution of the loop to the expected
running time, therefore, is negligibly more than $\epsilon_0$ times the
expected number of times the loop is executed (which is $1/\epsilon_0$), so the
contribution of the loop is $O(1)$.

We now prove the correctness of the reduction. Certainly if the adversary
corrupts any party besides $P_S$, the hybrid can supply a valid computation
history because it is acting honestly on behalf of that party. We now show that
if $b = 0$, the output of $S_{hybrid}$ is indistinguishable from a real run of
the Invert protocol. Similarly, if $b = 1$, the output is indistinguishable
from the output of $S_{Invert}$. Therefore an adversary that can detect a
simulation of Invert can be used to break the semantic security of the
underlying cryptosystem.

First, assume that $b = 0$. Then it is easy to verify that the hybrid acts
honestly on behalf of all the uncorrupted parties, and in the first round $P_S$
indeed publishes a random encryption of zero, so $\phi_B = \phi$. The only
deviation from the real protocol occurs in the creation of cheating commitments
for $P_S$ and in the proofs of plaintext knowledge, but these commitments are
computationally indistinguishable from honest commitments. Because
$\lambda'_S = \lambda_S$ and $r'_S = r_S$, the behavior of $P_S$ is
indistinguishable from an honest party's behavior in the real protocol.

Now assume that $b = 1$. Then $P_S$ publishes a random encryption of
$N - \phi$ as in the simulation, and $\phi_B = N$. Note that all
$\lambda_i, r_i$ belonging to honest parties are chosen uniformly except for
$\lambda'_S$ and $r'_S$. But as we will show, the distributions of those
variables are statistically indistinguishable from the respective uniform
distributions. So in fact the behavior of $P_S$ in the hybrid is
indistinguishable from its behavior under $S_{Invert}$.

It only remains to be proven that $\lambda_S, r_S$ are similarly distributed
with $\lambda'_S, r'_S$ (respectively), which we do here. We assume for
simplicity that $N - \phi = O(\sqrt{N})$, as is the case when
$\phi = \phi(N)$ and $N$ is the product of two large primes of approximately
equal size. First we state the following lemma:

Lemma 1 ([8]). Let $x, y$ be two integers such that $(x, y) = 1$ and $A, B$ two
integers such that $A < B$, $x, y < A$, and $B > Ax$. Then every integer $z$ in
the closed interval $[xy - x - y + 1, Ax + By - xy + x + y - 1]$ can be written
as $ax + by$ where $a \in [0, A]$ and $b \in [0, B]$. Furthermore, there exists
a polynomial-time algorithm that on input $x$, $y$, and $z$, outputs such $a$
and $b$.

Let us denote $\lambda$ as $\lambda_C + \lambda_H$, $\lambda'$ as
$\lambda_C + \lambda'_H$, $R$ as $R_C + R_H$, and $R'$ as $R_C + R'_H$. We
apply this lemma twice, first with $x = \phi$, $y = e$, and again with
$x = N$, $y = e$ to conclude that any integer $F$ in the interval
$[\delta, \Delta]$ can be written both as $\lambda\phi + Re$, and as
$\lambda' N + R'e$, where $\lambda, \lambda' \in [0, nN^2]$ and
$R, R' \in [0, nN^3]$. Here $\delta = Ne - N - e + 1$, and
$\Delta = n(N^2\phi + N^3 e) - \phi e + \phi + e - 1$.

Now, given any fixed $\lambda_C, R_C$ (the sums of the adversaries' chosen
values in the protocol) and any $\lambda_H$ (respectively, $R_H$) distributed
as the sum of at least $n/2$ honestly-chosen uniform values from $[0, N^2]$
(respectively, $[0, N^3]$), it is easy to see by Chernoff bounds that the
probability that $F$ falls outside the range $[\delta, \Delta]$ is negligible
since both bounds fall far away from the mean of $F$.

Now suppose $F \in [\delta, \Delta]$ and $\lambda_C, R_C$ are fixed as in the
protocol. Given a pair $\lambda_H, R_H$ such that
$F = (\lambda_C + \lambda_H)\phi + (R_C + R_H)e$, we present an efficient
mapping that produces $\lambda'_H, R'_H$ such that
$F = (\lambda_C + \lambda'_H)N + (R_C + R'_H)e$. That is,
$\lambda\phi - \lambda' N = (R' - R)e$. Since $(N, e) = 1$, for any given
$\lambda$ there exists a unique and efficiently-computable
$\lambda' \in [\lambda, \lambda + e - 1]$ such that $\lambda\phi - \lambda' N$
is a multiple of $e$. This determines the value
$\lambda'_H - \lambda_H + \lambda_S = \lambda'_S$ (one of the values published
by the first honest party in the hybrid simulator), and from that we can solve
for $R' - R + r_S = r'_S$ (the other published value).

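We need only show that $\lambda_S, \lambda'_S$ and $r_S, r'_S$ are close enough
in a statistical sense, i.e. that their differences are small relative to the
sizes of the intervals from which they are drawn. Indeed,
$|\lambda_S - \lambda'_S| = |\lambda - \lambda'| < e$, which is negligible
relative to $N^2$, and

$$|r_S - r'_S| = |R - R'| = \frac{|\lambda\phi - \lambda' N|}{e}
  = \frac{|(\lambda - \lambda')\phi + \lambda'(\phi - N)|}{e}
  < \phi + \frac{\lambda'(N - \phi)}{e}.$$

Thus, since $\lambda' = O(nN^2)$ and $N - \phi = O(\sqrt{N})$,
$|r_S - r'_S|/N^3 = O(n/\sqrt{N})$, which again is negligible. This completes
the proof.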



5 An Adaptively Secure Threshold Paillier Cryptosystem 

We introduce the following notation: for any $n \in \mathbb{N}$, $\lambda(n)$
denotes Carmichael's lambda function, defined as the largest order of the
elements of $Z_n^*$. It is known that if the prime factorization of an odd
integer $n$ is $\prod_{i=1}^{k} q_i^{f_i}$, then
$\lambda(n) = \mathrm{lcm}_{i=1,\ldots,k}\left(q_i^{f_i - 1}(q_i - 1)\right)$.
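
As a small illustration of this formula, the following sketch computes
$\lambda(n)$ for odd $n$ by trial-division factoring. It is only meant for toy
inputs, and the function names are hypothetical.

```python
from math import gcd


def factorize(n: int) -> dict[int, int]:
    """Trial-division factorization: returns {prime: exponent}."""
    factors: dict[int, int] = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def carmichael_odd(n: int) -> int:
    """lambda(n) for odd n: lcm over prime powers q^f of q^(f-1) * (q - 1)."""
    if n < 1 or n % 2 == 0:
        raise ValueError("n must be a positive odd integer")
    result = 1
    for q, f in factorize(n).items():
        term = q ** (f - 1) * (q - 1)
        result = result * term // gcd(result, term)   # running lcm
    return result


if __name__ == "__main__":
    print(carmichael_odd(23 * 59))
```

For a product of two safe primes $p = 2p' + 1$, $q = 2q' + 1$ this evaluates
to $2p'q'$, the value used for the secret key in section 5.2.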

Our protocols will make use of two tools: Shamir secret sharing over the
integers [26], and proofs of discrete log equality in groups of unknown
order [9, 4].



5.1 The Paillier Cryptosystem 

The Paillier cryptosystem [24] is based on composite-degree residuosity
classes, and has the desired homomorphic properties. It is based upon the
Carmichael lambda function in $Z^*_{n^2}$ and two useful facts regarding it:
for all $w \in Z^*_{n^2}$, $w^{\lambda(n)} = 1 \bmod n$, and
$w^{n\lambda(n)} = 1 \bmod n^2$. Here we recall the cryptosystem.

Key generation. Let $n = pq$ where $p, q$ are primes. Let
$g = (1 + n)^a b^n \bmod n^2$ for random $a, b \in Z_n^*$. The public key is
$(n, g)$ and the secret key is $\lambda(n)$.

Encryption. To encrypt a message $M \in Z_n$, randomly choose $x \in Z_n^*$ and
compute the ciphertext $c = g^M x^n \bmod n^2$.

Decryption. To decrypt $c$, compute
$M = \frac{L(c^{\lambda(n)} \bmod n^2)}{L(g^{\lambda(n)} \bmod n^2)} \bmod n$,
where the domain of $L$ is the set $S_n = \{u < n^2 : u = 1 \bmod n\}$ and
$L(u) = (u - 1)/n$.
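
The three algorithms can be sketched directly from these formulas. The code
below is a toy illustration only: the primes $p = 23$, $q = 59$ are assumed
for the example (chosen so that $\gcd(n, \lambda(n)) = 1$), the helper names
are hypothetical, and nothing here is suitable for real use.

```python
import secrets
from math import gcd


def random_unit(n: int) -> int:
    """Sample a uniformly random element of Z_n^*."""
    while True:
        x = secrets.randbelow(n)
        if gcd(x, n) == 1:
            return x


def L(u: int, n: int) -> int:
    """L(u) = (u - 1) / n, defined for u = 1 mod n."""
    return (u - 1) // n


def keygen(p: int, q: int):
    n = p * q
    lam = (p - 1) * (q - 1) // gcd(p - 1, q - 1)      # Carmichael lambda(n)
    a, b = random_unit(n), random_unit(n)
    g = (pow(1 + n, a, n * n) * pow(b, n, n * n)) % (n * n)
    return (n, g), lam


def encrypt(pk, M: int) -> int:
    n, g = pk
    x = random_unit(n)
    return (pow(g, M, n * n) * pow(x, n, n * n)) % (n * n)


def decrypt(pk, lam: int, c: int) -> int:
    n, g = pk
    numerator = L(pow(c, lam, n * n), n)
    denominator = L(pow(g, lam, n * n), n)
    return (numerator * pow(denominator, -1, n)) % n   # Python 3.8+


if __name__ == "__main__":
    pk, lam = keygen(23, 59)
    n = pk[0]
    c1, c2 = encrypt(pk, 100), encrypt(pk, 200)
    print(decrypt(pk, lam, c1))
    # Multiplying ciphertexts adds the plaintexts modulo n.
    print(decrypt(pk, lam, (c1 * c2) % (n * n)))
```

The last two lines exercise the additive homomorphism that the inversion
protocol of section 4.2 relies on.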

Other useful properties. The Paillier cryptosystem is homomorphic, in the
sense of the definition in Appendix B. Cramer et al. [11] provide Σ-protocols
for proof of plaintext knowledge and proof of correct multiplication. We also
require a proof that a ciphertext is an encryption of zero; this is merely a
proof of $n$th residuosity modulo $n^2$. Such a proof is virtually identical to
a zero-knowledge proof of quadratic residuosity mod $n$ as given by, for
example, Goldwasser et al. [20]. Finally, we require a proof that, given a
ciphertext, the corresponding plaintext lies within a specified range.
Boudot [2] describes such a proof for committed values, and a proof of equality
between a committed value and a ciphertext in the Paillier cryptosystem can be
constructed using standard techniques (see, for example, Camenisch and
Lysyanskaya [3]).

The security of the scheme is based upon the composite residuosity class
problem, which is exactly the problem of decrypting a ciphertext. Semantic
security can be proven based on the hardness of detecting $n$th residues mod
$n^2$.

Fouque et al. [16] present a threshold version of the Paillier cryptosystem, 
using techniques developed by Shoup [27] for threshold RSA signatures. The 
version presented there is known to be secure only in the static adversary model, 
assuming the semantic security of the non-threshold version. 



5.2 An Adaptively Secure Threshold Version 

Here we present a threshold Paillier cryptosystem which is secure in the
adaptive adversary model, based upon the security of Paillier's cryptosystem
and the existence of trapdoor commitment schemes. This cryptosystem is inspired
by the statically-secure threshold version presented in Fouque et al. [16].

Description of the protocols. Recall $\Delta = l!$, where $l$ is the number of
parties.

Key generation. We first describe key generation in terms of an $l$-party
function on input $k$, the security parameter. This function is evaluated by a
trusted party, who distributes the proper values to the parties.

Choose an integer $n$, the product of two strong primes $p, q$ of length $k$
such that $p = 2p' + 1$ and $q = 2q' + 1$, and $\gcd(n, \phi(n)) = 1$. Set
$\lambda = 2p'q' = \lambda(n)$. Choose random $(a, b) \in Z_n^* \times Z_n^*$,
and let $g = (1 + n)^a b^n \bmod n^2$. The secret key is the value
$\beta\lambda$ for a random $\beta \in Z_n^*$, which is shared additively as
follows: for all parties $P_i$ but one, choose random $s_i \in Z_{n\lambda}$,
and choose the last $s_i$ such that
$\sum_i s_i = \beta\lambda \bmod n\lambda$. The public key is the triple
$(g, n, \theta)$, where $\theta = a\beta\lambda \bmod n$. To compute public
verification keys, choose a random public square $v$ from
$Z^*_{n^2}$, and let $v_i = v^{s_i} \bmod n^2$. In addition, compute polynomial
backups for each $s_i$ as follows: let $a_{i,0} = \Delta s_i$, and choose
random $a_{i,j} \in [-\Delta^2 n^3/2, \ldots, \Delta^2 n^3/2]$ for
$j = 1, \ldots, t$, then define a polynomial over the integers
$f_i(X) = \sum_{j=0}^{t} a_{i,j} X^j$ (so that $f_i(0) = \Delta s_i$). To each
party $P_j$, give the values $f_i(j)$ for all $i$. Finally, compute public
commitments for these backup shares using any perfectly-hiding commitment
scheme, such as Pedersen's [25]. Let the public value $w_{i,j}$ be a commitment
to $f_i(j)$ under public key $k_j$ and random string $r_{i,j}$, and give
$r_{i,j}$ to party $P_j$.

It is well known [19,1,6,10] that for any $l$-party function, there is an
adaptively secure protocol which evaluates it. Therefore there is a simulator
which, given all the outputs of the function (excluding any values belonging
only to $P_S$), interacts with the adversary and gives it a suitable view of
the key generation protocol. In section 5.2 we describe how to provide
suitably-distributed inputs to this simulator. This key generation protocol may
be very inefficient, but it is only executed once to initialize the threshold
cryptosystem.

Encryption. To encrypt a message $M \in Z_n$, pick random $x \in Z_n^*$ and
compute the ciphertext $c = g^M x^n \bmod n^2$.

Computing decryption shares. Player $P_i$ computes his decryption share
$c_i = c^{s_i} \bmod n^2$, and proves via a Σ-protocol that $c_i^2$ (in base
$c^2$) and $v_i$ (in base $v$) have the same discrete log $s_i$ in
$Z^*_{n^2}$.

Combining shares. If any party $P_i$ refuses to publish his $c_i$, or gives an
invalid proof, then the other parties reconstruct his secret share $s_i$ as
follows. Each party $P_j$ publishes its backup share $f_i(j)$ and random string
$r_{i,j}$, and all parties verify that $w_{i,j}$ matches those values. Because
there are at least $t + 1$ honest parties, each party may pick some $t + 1$
honestly-published values $f_i(j)$, and by interpolation, discover
$s_i = f_i(0)/\Delta$ and compute $c_i = c^{s_i} \bmod n^2$.
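
The backup mechanism can be sketched as integer Shamir sharing followed by
Lagrange interpolation at zero. The example below is a minimal illustration
with hypothetical names; the coefficient bound is a free parameter of the
sketch rather than the range used in the paper, and the commitments $w_{i,j}$
are omitted.

```python
import secrets
from fractions import Fraction
from math import factorial


def make_backup_polynomial(secret: int, parties: int, t: int, coeff_bound: int) -> list[int]:
    """Integer polynomial f with f(0) = Delta * secret and degree at most t."""
    delta = factorial(parties)                          # Delta = l!
    coeffs = [delta * secret]
    for _ in range(t):
        coeffs.append(secrets.randbelow(2 * coeff_bound + 1) - coeff_bound)
    return coeffs


def evaluate(coeffs: list[int], x: int) -> int:
    """Evaluate the polynomial at x using Horner's rule."""
    result = 0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def recover_secret(points: list[tuple[int, int]], parties: int) -> int:
    """Interpolate f(0) from t + 1 points (j, f(j)) and divide by Delta."""
    delta = factorial(parties)
    free_term = Fraction(0)
    for j, f_j in points:
        weight = Fraction(1)
        for m, _ in points:
            if m != j:
                weight *= Fraction(-m, j - m)           # Lagrange basis at X = 0
        free_term += f_j * weight
    if free_term.denominator != 1 or free_term.numerator % delta != 0:
        raise ValueError("inconsistent backup shares")
    return free_term.numerator // delta


if __name__ == "__main__":
    parties, t, secret = 5, 2, 123456
    f = make_backup_polynomial(secret, parties, t, coeff_bound=10 ** 12)
    backups = {j: evaluate(f, j) for j in range(1, parties + 1)}
    print(recover_secret([(j, backups[j]) for j in (2, 4, 5)], parties))
```

Any $t + 1$ of the published points determine the polynomial, so the choice of
which honest parties' values to use does not affect the recovered share.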

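Now each party has a correct value $c_i = c^{s_i} \bmod n^2$, for all $i$. The
message can be computed by each party as follows:

$$\frac{L\left(\prod_{i \in P} c_i \bmod n^2\right)}{\theta}
  = \frac{L\left(c^{\sum_{i \in P} s_i} \bmod n^2\right)}{\theta}
  = \frac{L\left(c^{\beta\lambda} \bmod n^2\right)}{\theta}
  = \frac{L\left(g^{\beta\lambda M} \bmod n^2\right)}{\theta}
  = \frac{a\beta\lambda M}{\theta}
  = M \bmod n$$

since the value $\theta = a\beta\lambda \bmod n$ is part of the public key.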
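
The share combination can be checked numerically with a small sketch. The code
below plays the trusted dealer and all players in one process, omits the
verification keys, proofs and backups, and uses the toy primes $p = 23$,
$q = 59$; these choices and all function names are assumptions made for
illustration.

```python
import secrets
from math import gcd


def random_unit(n: int) -> int:
    """Sample a uniformly random element of Z_n^*."""
    while True:
        x = secrets.randbelow(n)
        if gcd(x, n) == 1:
            return x


def deal_keys(p: int, q: int, parties: int):
    """Trusted-dealer key generation: public key (g, n, theta) and additive shares."""
    n = p * q
    lam = (p - 1) * (q - 1) // gcd(p - 1, q - 1)        # equals 2p'q' for safe primes
    a, b, beta = random_unit(n), random_unit(n), random_unit(n)
    g = (pow(1 + n, a, n * n) * pow(b, n, n * n)) % (n * n)
    shares = [secrets.randbelow(n * lam) for _ in range(parties - 1)]
    shares.append((beta * lam - sum(shares)) % (n * lam))  # sum = beta*lam mod n*lam
    theta = (a * beta * lam) % n
    return (g, n, theta), shares


def encrypt(pk, M: int) -> int:
    g, n, _ = pk
    x = random_unit(n)
    return (pow(g, M, n * n) * pow(x, n, n * n)) % (n * n)


def decryption_share(pk, c: int, s_i: int) -> int:
    _, n, _ = pk
    return pow(c, s_i, n * n)                            # c_i = c^{s_i} mod n^2


def combine(pk, decryption_shares: list[int]) -> int:
    _, n, theta = pk
    product = 1
    for c_i in decryption_shares:
        product = (product * c_i) % (n * n)
    # product = 1 + theta*M*n mod n^2, so L(product) = theta*M mod n.
    return ((product - 1) // n * pow(theta, -1, n)) % n


if __name__ == "__main__":
    pk, shares = deal_keys(23, 59, parties=5)
    c = encrypt(pk, 777)
    print(combine(pk, [decryption_share(pk, c, s) for s in shares]))
```

The same identity, $\prod_i c_i = 1 + M\theta n \bmod n^2$, is what lets a
simulator fix the share of one party so that the product decrypts to a chosen
message.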

Simulating decryption. The input to the decryption simulator is a tuple
$(\{s_i\}, \{v_i\}, \{a_{i,j}\}, \{w_{i,j}\}, \{k_i\}, g, n, \theta, v, c, M, P_S, t_S, \mathcal{A})$.
The sets and $(g, n, \theta)$ are simulated values corresponding to those in
the real protocol; $c$ is the ciphertext to be decrypted and $M$ is its
decryption; $P_S$ is the identity of the single inconsistent party and $t_S$ is
its commitment trapdoor; $\mathcal{A}$ is an arbitrary input corresponding to
the state of the adversary before the protocol execution. In the next section,
we describe how these simulated values can be generated from only a public key
from the single-server Paillier cryptosystem.

The simulator acts honestly on behalf of all uncorrupted parties $P_i$
(excluding $P_S$) by publishing $c_i = c^{s_i} \bmod n^2$ and proving
correctness of the decryption shares. On behalf of $P_S$, the simulator
publishes $c_S = (1 + M\theta n)\, c^{-\sum_{i \ne S} s_i} \bmod n^2$ and
provides a false proof of correctness using $t_S$. If any corrupted party fails
to provide a correct decryption share, the simulator honestly interpolates that
party's secret share as in the decryption protocol, and proceeds normally. The
simulator then honestly computes the plaintext by multiplying the published
shares, yielding $(1 + M\theta n) \bmod n^2$, applying $L$, and dividing by
$\theta$ to get common output $M$.

The view of the adversary under the simulation is statistically
indistinguishable from a real run of the protocol, provided that all public
inputs are suitably simulated. If the adversary corrupts any party $P_j$ (other
than $P_S$), that party's behavior over every invocation of the protocol is
consistent with the secret $s_j$ revealed to the adversary. In addition, the
adversary is entitled to see $f_i(j)$ and $r_{i,j}$, for all $i$. When
$i \ne S$, the values are consistent with anything else the adversary has seen.
For $i = S$, we prove below that with high probability, any set of at most $t$
values $f_S(j)$ is distributed similarly regardless of the value being shared,
and therefore the simulated values $f_i(j)$ are statistically indistinguishable
from those in a real run.

Simulating key generation. We now show that the outputs of the key generation
function can be simulated (up to statistical closeness), given a public key
$(g', n)$ and the identity of the single inconsistent party $P_S$. (It is
sufficient to simulate every value produced by the key generation function,
except the secret share $s_S$ belonging to $P_S$. This is because the entire
simulation is aborted if the adversary ever attempts to corrupt $P_S$, so we
need not simulate its private data.) When these values are given to the
simulator for the key-generation protocol, it generates a suitable view for the
adversary.

Choose random $(x, y, \theta) \in Z_n^* \times Z_n^* \times Z_n^*$ and let
$g = (g')^x y^n \bmod n^2$. Choose random $\alpha \leftarrow Z_n^*$, and let
$v = g^{2\alpha} \bmod n^2$. Then for each player, choose random
$s_i \leftarrow [0, \ldots, \lfloor n^2/2 \rfloor - 1]$, and create
verification shares $v_i = v^{s_i} \bmod n^2$ for all parties but $P_S$. For
$P_S$, set $v_S = (1 + 2\alpha\theta n)\, v^{-\sum_{i \ne S} s_i} \bmod n^2$.
Finally, create commitments $w_{i,j}$ honestly (from polynomials with free
terms $\Delta s_i$ and random coefficients) for all $i$ and $j$, and random
$r_{i,j}$.

First, note that the statistical difference between the uniform distributions
on $[0, \ldots, \lfloor n^2/2 \rfloor - 1]$ and $Z_{n\lambda}$ is
$O(n^{-1/2})$, so any set of at most $l - 1$ secret keys $s_i$ is statistically
indistinguishable between a real and simulated run. Both $g$ and $\theta$ are
uniformly chosen from their respective domains, and are identically distributed
with their respective values in the real protocol. In addition, $v$ is a random
element of $Q_{n^2}$, the cyclic group of squares mod $n^2$. Because
$|Q_{n^2}| = pqp'q'$, and $\phi(pqp'q') = (p - 1)(q - 1)(p' - 1)(q' - 1)$, $v$
is a generator of $Q_{n^2}$ with high probability, and is identically
distributed with its value in the real protocol.

Note that any set of at most $\ell - 1$ simulated verification keys $v_i$ is statistically
close to a real set. However, in the real protocol with a fixed $v$, the values of $\ell - 1$
verification shares induce a distribution upon the last (because the values of $\ell - 1$
secret shares $s_i$ induce a distribution upon the last). That is, it is necessary and
sufficient that mod rY for some uniformly-chosen /3 from Z*. In 

the simulation, we choose OieP = (1 + 2a9n) mod rY without knowing 

A but just by randomly choosing 9, which induces a uniform distribution upon 
P as desired. 

Finally, we note that the simulated set $\{w_{i,j}\}$ is identically distributed to
its counterpart in the real protocol, by the perfect hiding property of the
commitment scheme.

It remains to be shown that the simulated values $f_i(j)$ for all $i$ and for
the adversary's chosen $j$ are indistinguishable from those in a real run. It is
clear that the $f_i(j)$ are identically distributed for $i \neq S$, because the simulator
behaves honestly. It is also obvious that the points of different polynomials are
independent. We therefore show that with high probability, the values $f_S(j)$
seen by the adversary are consistent with a polynomial having free term $\Delta s_S$
and coefficients from the proper range, for any value of $s_S$.

Let $f_S(X)$ be the polynomial used in the simulation, that is,
$f_S(X) = \Delta s_S + \sum_{i=1}^{t} a_{S,i} X^i$, where the $a_{S,i}$ are randomly chosen. Say that
the adversary has corrupted a set of parties $C$, with $|C| \le t$. We wish to find a
polynomial $\hat f_S(X)$ such that $\hat f_S(0) = \Delta \hat s_S$ for an arbitrary $\hat s_S$, and
$\hat f_S(i) = f_S(i)$ for $i \in C$. Consider a polynomial $h(X)$ such that
$h(0) = \Delta(\hat s_S - s_S)$, and $h(i) = 0$ for $i \in C$. Then we have
$\hat f_S(X) = f_S(X) + h(X)$. By interpolation,

$$h(X) = \sum_{i \in \{0\} \cup C} h(i) \prod_{j \neq i,\ j \in \{0\} \cup C} \frac{X - j}{i - j} = \Delta(\hat s_S - s_S) \prod_{j \in C} \frac{j - X}{j},$$

so the coefficient of $X^i$ in $h(X)$ is $\Delta(\hat s_S - s_S) \sum_{B \subseteq C,\ |B| = i} \prod_{j \in B} \frac{-1}{j}$, which
is bounded in absolute value by

$$\Delta |\hat s_S - s_S| \sum_{B \subseteq C,\ |B| = i} \prod_{j \in B} \frac{1}{j} \le \Delta |\hat s_S - s_S| \binom{t}{i} \le \Delta |\hat s_S - s_S|\, t! \le \Delta^2 n^2 / 2,$$

since $s_S, \hat s_S \in \{0, \ldots, n^2/2\}$.

Now the coefficients of $\hat f_S(X)$ are outside of the desired range only if any of
the coefficients of $f_S(X)$ are outside of $[-\Delta^2(n^3 - n^2)/2, \ldots, \Delta^2(n^3 - n^2)/2]$. By
the union bound, this happens with probability at most $t/n$, which is negligible.
In addition, there is a bijection between the coefficients of $f_S$ and the coefficients
of $\hat f_S$ when $s_S$, $\hat s_S$, and $C$ are fixed. Therefore the distribution of the coefficients
of $\hat f_S$ is statistically close to uniform, as desired.

A reduction from the original cryptosystem. With these simulations in
hand, the reduction from one-server semantic security to threshold semantic
security is straightforward. Assume there is an adversary that can break the
security of the threshold cryptosystem. Given a public key $(g', n)$ for the
single-server Paillier cryptosystem, we first simulate the key generation protocol
and any decryptions as described above. (Recall that the public key of the
threshold cryptosystem is $(g = (g')^x y^n \bmod n^2, n, \theta)$ for some uniformly-chosen
$x, y, \theta$.)

The adversary then outputs two messages $m_0, m_1$, which we send to an oracle,
who responds with a random encryption $c$ of $m_b$ for some random bit $b$. We
compute $\chi = c^x \bmod n^2$ (where $x$ is the value chosen by the key generation
simulator) and give it to the adversary. By assumption, the adversary can distinguish
with non-negligible advantage whether $\chi$ is an encryption of $m_0$ or $m_1$ under
$(g, n, \theta)$. This is equivalent to whether $c$ is an encryption of $m_0$ or $m_1$ under
$(g', n)$, hence we have broken the semantic security of the original cryptosystem.
This completes the reduction.
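
The re-keying step of this reduction can be checked numerically. The following is a minimal sketch, assuming that the threshold public key has the form g = (g')^x y^n mod n^2 and that the challenge ciphertext is re-keyed by raising it to the power x. The toy primes, the helper names (`L`, `decrypt`, `encrypt`, `rekey`) and all constants are illustrative assumptions, not values from the paper, and are far too small to be secure.

```python
# Toy check: raising a Paillier ciphertext under g' to the power x yields
# a ciphertext of the same message under g = (g')^x * y^n mod n^2.
# All concrete numbers are illustrative assumptions.

def L(u: int, n: int) -> int:
    """Paillier's L function, defined on values u = 1 (mod n)."""
    return (u - 1) // n

def decrypt(c: int, g: int, n: int, lam: int) -> int:
    """Single-server Paillier decryption with base g and trapdoor lam."""
    n2 = n * n
    num = L(pow(c, lam, n2), n)
    den = L(pow(g, lam, n2), n)
    return num * pow(den, -1, n) % n   # modular inverse needs Python 3.8+

p, q = 11, 13
n, n2 = p * q, (p * q) ** 2
lam = 60                               # lcm(p - 1, q - 1)

# Original single-server key: g' = (1 + n)^a * b^n mod n^2.
a, b = 5, 7
g_prime = pow(1 + n, a, n2) * pow(b, n, n2) % n2

# Simulated threshold key: g = (g')^x * y^n mod n^2.
x, y = 9, 10
g = pow(g_prime, x, n2) * pow(y, n, n2) % n2

def encrypt(m: int, base: int, r: int) -> int:
    """Paillier encryption of m with base `base` and randomness r."""
    return pow(base, m, n2) * pow(r, n, n2) % n2

def rekey(c: int) -> int:
    """Turn a ciphertext under g' into a ciphertext under g."""
    return pow(c, x, n2)
```

Decrypting `rekey(c)` with base `g` gives the same plaintext as decrypting `c` with base `g_prime`, which is why a distinguisher for the re-keyed ciphertext is also a distinguisher for the original one.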
A The Mad Protocol

The Mad protocol takes common inputs w, x, y, z and returns common output
F = wx + yz. It is implemented as follows:

1. Each party publishes a trapdoor commitment to a random string for use
in the multiplication-by-ring-element algorithm.

2. The parties open their commitments, and compute r as the exclusive-or of
all properly-decommitted strings.

3. Each party runs the multiplication-by-ring-element algorithm on inputs w
and x with random string r, yielding a common random ciphertext wx.

4. The parties enter the Mult protocol on y, z, yielding common random
ciphertext yz.

5. Each party uses the deterministic addition-of-ciphertexts algorithm to
compute a common input wx + yz to the Decrypt protocol, yielding common
output F = wx + yz, as desired.

For the proof of security, we refer the reader to the full version [22] of the
paper.
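
Steps 1 and 2 follow a commit-then-open pattern for agreeing on a joint random string. The sketch below illustrates only that pattern. It swaps in a plain SHA-256 hash commitment for the trapdoor commitment the protocol requires, and the function names `commit` and `joint_string` are hypothetical.

```python
import hashlib

def commit(value: bytes, nonce: bytes) -> bytes:
    """Hash commitment (an assumed stand-in for the trapdoor commitment)."""
    header = len(nonce).to_bytes(2, "big")   # makes nonce/value split unambiguous
    return hashlib.sha256(header + nonce + value).digest()

def joint_string(commitments, openings, length):
    """XOR together every string whose opening matches its commitment.

    commitments: list of bytes, one per party (step 1)
    openings:    list of (value, nonce) pairs, one per party (step 2)
    Improperly decommitted strings are simply skipped.
    """
    r = bytes(length)
    for com, (value, nonce) in zip(commitments, openings):
        if len(value) == length and commit(value, nonce) == com:
            r = bytes(a ^ b for a, b in zip(r, value))
    return r
```

Because every party commits before any string is revealed, no party can choose its contribution as a function of the others, so r is uniform as long as one properly opened string is.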
B Homomorphic Threshold Encryption

Here, for self-containment, we provide a modification of the definitions given by
Cramer et al. [11]. The modification is that we require security in the adaptive
setting.

B.1 Threshold Cryptosystems

Here we define threshold encryption schemes and their security properties.

Definition 1. An adaptively-secure threshold cryptosystem for parties
$P = \{P_1, \ldots, P_\ell\}$ with threshold $t < \ell$ and security parameter $k$ is a 5-tuple
$(K, KG, M, E, Decrypt)$ having the following properties:

1. (Key space) The key space $K = \{K_k\}_k$ is a family of finite sets of the
form $(pk, sk_1, \ldots, sk_\ell)$. We call $pk$ the public key and $sk_i$ the private key
share of party $P_i$. For $C \subseteq P$ we denote the family $\{sk_i\}_{i \in C}$ by $sk_C$.

2. (Key generation) There exists an adaptively $t$-secure key generation $\ell$-party
protocol $KG$ which, on input $1^k$, computes, in probabilistic polynomial time,
public output $pk$ and secret output $sk_i$ for party $P_i$, where
$(pk, sk_1, \ldots, sk_\ell) \in K_k$. We write $(pk, sk_1, \ldots, sk_\ell) \leftarrow KG(1^k)$ to represent
this process.

3. (Message sampling) There exists some probabilistic polynomial-time
algorithm which, on input $pk$, outputs a uniformly random element from a
message space $M_{pk}$. We write $m \leftarrow M_{pk}$ to describe this process.

4. (Encryption) There exists a probabilistic polynomial-time algorithm $E$ which,
on input $pk$ and $m \in M_{pk}$, outputs an encryption $\overline{m} = E_{pk}(m)[r]$ of $m$. Here
$r$ is a uniformly random string used as the random input, and $E_{pk}(m)[r]$
denotes the encryption algorithm run on inputs $pk$ and $m$, with random tape
containing $r$.

5. (Decryption) There exists an adaptively $t$-secure protocol Decrypt which,
on common public input $(c, pk)$ and secret input $sk_i$ for each uncorrupted
party $P_i$, where $sk_i$ is the secret key share of the public key $pk$ (as generated
by $KG$) and $c = E_{pk}(m)[r]$ is an encrypted message for some $r$, returns $m$
as a common public output.

6. (Threshold semantic security) For all probabilistic circuit families $\{S_k\}$ (called
the message sampler) and $\{D_k\}$ (called the distinguisher), all constants $c > 0$,
all sufficiently large $k$, and all $C \subseteq P$ such that $|C| \le t$,

$$\Pr[(pk, sk_1, \ldots, sk_\ell) \leftarrow KG(1^k);\ (m_0, m_1, s) \leftarrow S_k(pk, sk_C);\ i \leftarrow \{0, 1\};\ e \leftarrow E(pk, m_i);\ b \leftarrow D_k(s, e) : b = i] < 1/2 + 1/k^c.$$

B.2 Homomorphic Properties

We also need the cryptosystem to have the following homomorphic properties:

1. (Message ring) For all public keys $pk$, the message space $M_{pk}$ is a ring in
which we can compute efficiently using the public key only. We denote the
ring $(M_{pk}, \cdot_{pk}, +_{pk}, 0_{pk}, 1_{pk})$. We require that the identity elements $0_{pk}$ and
$1_{pk}$ be efficiently computable from the public key.

2. ($+_{pk}$-homomorphic) There exists a polynomial-time algorithm which, given
public key $pk$ and encryptions $\overline{m_1} \in E_{pk}(m_1)$ and $\overline{m_2} \in E_{pk}(m_2)$, outputs
a uniquely-determined encryption $\overline{m} \in E_{pk}(m_1 +_{pk} m_2)$. We write
$\overline{m} = \overline{m_1} \boxplus \overline{m_2}$. Likewise, there exists a polynomial-time algorithm for
performing subtraction: $\overline{m} = \overline{m_1} \boxminus \overline{m_2}$.

3. (Multiplication of a ciphertext by a ring element) There exists a
probabilistic polynomial-time algorithm which, on input $pk$, $m_1 \in M_{pk}$ and
$\overline{m_2} \in E_{pk}(m_2)$, outputs a random encryption $\overline{m} \leftarrow E_{pk}(m_1 \cdot_{pk} m_2)$. We
assume that we can multiply a ring element from both the left and right. We
write $\overline{m} \leftarrow m_1 \boxdot \overline{m_2} \in E_{pk}(m_1 \cdot_{pk} m_2)$ and $\overline{m} \leftarrow \overline{m_1} \boxdot m_2 \in E_{pk}(m_1 \cdot_{pk} m_2)$.
Let $(m_1 \boxdot \overline{m_2})[r]$ denote the unique encryption produced by using $r$ as the
random coins in the multiplication-by-ring-element algorithm.

4. (Addition of a ciphertext and a ring element) There exists a probabilistic
polynomial-time algorithm which, on input $pk$, $m_1 \in M_{pk}$ and
$\overline{m_2} \in E_{pk}(m_2)$, outputs a uniquely-determined encryption $\overline{m} \in E_{pk}(m_1 +_{pk} m_2)$.
We write $\overline{m} = m_1 \boxplus \overline{m_2}$.

5. (Blindable) There exists a probabilistic polynomial-time algorithm $B$ which,
on input $pk$ and $\overline{m} \in E_{pk}(m)$, outputs an encryption $\overline{m}' \in E_{pk}(m)$ such that
$\overline{m}' = E_{pk}(m)[r]$, where $r$ is chosen uniformly at random.

6. (Check of ciphertextness) By $C_{pk}$ we denote the set of possible encryptions
of any message, under the public key $pk$. Given $y \in \{0, 1\}^*$ and $pk$, it is easy
to check whether $y \in C_{pk}$.

7. (Proof of plaintext knowledge) Let $L_1 = \{(pk, y) : pk \text{ is a public key} \wedge y \in C_{pk}\}$.
There exists a $\Sigma$-protocol for proving the relation $R_{PoPK}$ over
$L_1 \times (\{0, 1\}^*)^2$ given by $R_{PoPK} = \{((pk, y), (x, r)) : x \in M_{pk} \wedge y = E_{pk}(x)[r]\}$.
Let $S_{PoPK}$ be the simulator for this $\Sigma$-protocol, which is just a special case
of $S_\Sigma$ described in section 3.

8. (Proof of correct multiplication) Let $L_2 = \{(pk, x, y, z) : pk \text{ is a public key} \wedge x, y, z \in C_{pk}\}$.
There exists a $\Sigma$-protocol for proving the relation $R_{PoCM}$
over $L_2 \times (\{0, 1\}^*)^3$ given by
$R_{PoCM} = \{((pk, x, y, z), (d, r_1, r_2)) : y = E_{pk}(d)[r_1] \wedge z = (d \boxdot x)[r_2]\}$.

We call any such scheme meeting these additional requirements a homomorphic
threshold cryptosystem.

From these properties, it is clear how to perform addition of two ciphertexts:
use the $+_{pk}$-homomorphic algorithm, followed by an optional blinding step. The
remaining operation to be supported is secure multiplication of ciphertexts. That
is, given $\overline{a}$ and $\overline{b}$, determine a ciphertext $\overline{c}$ such that $c = a \cdot_{pk} b$, without
leaking any information about $a$, $b$, or $c$. Cramer et al. [11] give the Mult protocol
for multiplication, and prove its security against a static adversary.
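
The public-key operations of properties 2 to 5 can be illustrated on the (non-threshold) Paillier cryptosystem. This is a minimal sketch under stated assumptions: the base is fixed to g = 1 + n, the primes are toy-sized, decryption is done by a single party holding the trapdoor, and all function names are hypothetical.

```python
from math import gcd

def keygen(p: int, q: int):
    """Return (n, lam) for toy Paillier with base g = 1 + n (an assumption)."""
    n = p * q
    lam = (p - 1) * (q - 1) // gcd(p - 1, q - 1)   # lcm(p - 1, q - 1)
    return n, lam

def encrypt(n: int, m: int, r: int) -> int:
    # (1 + n)^m = 1 + m*n (mod n^2), so no exponentiation by m is needed.
    return (1 + m * n) * pow(r, n, n * n) % (n * n)

def decrypt(n: int, lam: int, c: int) -> int:
    u = pow(c, lam, n * n)                          # equals 1 + m*lam*n (mod n^2)
    return (u - 1) // n * pow(lam, -1, n) % n

def add(n, c1, c2):            # property 2: ciphertext + ciphertext
    return c1 * c2 % (n * n)

def sub(n, c1, c2):            # property 2: ciphertext - ciphertext
    return c1 * pow(c2, -1, n * n) % (n * n)

def scalar_mul(n, a, c, r):    # property 3: ring element * ciphertext, randomized
    return pow(c, a, n * n) * pow(r, n, n * n) % (n * n)

def add_plain(n, a, c):        # property 4: ring element + ciphertext
    return (1 + a * n) * c % (n * n)

def blind(n, c, r):            # property 5: re-randomize without changing m
    return c * pow(r, n, n * n) % (n * n)
```

Multiplying two ciphertexts together is not among these operations, which is why a separate interactive Mult protocol is needed.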
Abstract. Semantic security against chosen-ciphertext attacks (IND-CCA)
is widely believed to be the correct security level for public-key encryption
schemes. On the other hand, it is often dangerous to give the power of
decryption to only one person. Therefore, threshold cryptosystems aim at
distributing the decryption ability. However, only two efficient such schemes
have been proposed so far for achieving IND-CCA. Both are El Gamal-like
schemes and thus are based on the same intractability assumption, namely
the Decisional Diffie-Hellman problem.

In this article we rehabilitate the twin-encryption paradigm proposed
by Naor and Yung to present generic conversions from a large family
of (threshold) IND-CPA schemes into (threshold) IND-CCA ones in the
random oracle model. An efficient instantiation is also proposed, which is
based on the Paillier cryptosystem. This new construction provides the
first example of a threshold cryptosystem secure against chosen-ciphertext
attacks based on the factorization problem. Moreover, this construction
provides a scheme where the “homomorphic properties” of the original
scheme still hold. This may seem paradoxical, because homomorphic
cryptosystems are known to be malleable and therefore not CCA
secure. However, we do not build a “homomorphic cryptosystem”, but
just keep the homomorphic properties.

Keywords: Threshold Cryptosystems, Chosen-Ciphertext Attacks 



1 Introduction 

1.1 Chosen-Ciphertext Security 

Semantic security against chosen-ciphertext attacks represents the correct
security definition for a cryptosystem [31,41,4]. Therefore many works [26,25,
38,34] have recently proposed schemes to convert any one-way function into a
cryptosystem secure according to this security notion.

Before this notion, Naor and Yung in [33] proposed a weaker security notion
that they called lunch-time attack (a.k.a. indifferent, or non-adaptive,
chosen-ciphertext attack). The adversary can only ask for the decryption of
ciphertexts before he receives the target ciphertext. Naor and Yung [33] presented
a conversion to schemes secure against chosen-ciphertext attack in a lunch-time
scenario. They used non-interactive zero-knowledge proof systems (proofs of
membership [9,8]) to show the consistency of the ciphertext, but not to prove that
the party who built the ciphertext necessarily “knew its decryption”.

Later Rackoff and Simon [41] refined this construction, replacing the
non-interactive zero-knowledge proofs of membership by non-interactive
zero-knowledge proofs of knowledge. Therefore, when encrypting a message, one
also appends a non-interactive proof of knowledge of the plaintext, which leads
to (adaptive) chosen-ciphertext secure cryptosystems. Indeed, the sender proves
that he knows the plaintext and thus CCA is reduced to CPA.

A similar notion has thereafter been defined, the so-called “plaintext-awareness”
[7,4], which means that when someone builds a valid ciphertext, he
necessarily “knows” the corresponding plaintext. Therefore, a decryption oracle is
useless for an adversary. But this latter notion is meaningful only in the random
oracle model [6].

In recent years, several efficient schemes have been proposed which achieve this
high security level. Most of them have only been proven in the random oracle
model [7,27,48,36,25,26,38,34] using the plaintext-awareness property, and only
one in the standard model [14].



1.2 Threshold Cryptosystems 

On the one hand, in public-key cryptography in general, the ability to decrypt
or sign is restricted to the owner of the secret key. This means that only one
person has all the power. In some situations, however, such an ability should not
be given to only one person, but shared among a group of users, such that a
minimal number of them, the threshold, is needed to sign or decrypt.

On the other hand, the goal of cryptography is to withstand attackers. In
the case of break-ins, i.e. an adversary that can enter a computer and steal the
secret key, public-key systems in general are not protected against exposure of the
secret key. As attacks of this kind, carried out by intruders (hackers, Trojan
horses) or by corrupted insiders, are very common and frequently easy to perform,
systems must be protected against them. Threshold cryptography can solve this
problem by distributing trust among several components or servers. The secret key
is then split into shares and each share is given to one of a group of servers.
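
As an illustration of splitting a key into shares, here is a minimal sketch using Shamir's polynomial secret sharing over a prime field. The text does not name a particular sharing scheme at this point, so the choice of Shamir sharing, the field modulus and the function names `split` and `recover` are all assumptions made for the example.

```python
import secrets

PRIME = 2**127 - 1   # a Mersenne prime, used here as the field modulus (assumption)

def split(secret: int, t: int, n_servers: int):
    """Share `secret` so that any t + 1 of the n_servers shares recover it."""
    # Random polynomial of degree t with constant term equal to the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(t)]

    def f(x: int) -> int:
        acc = 0
        for c in reversed(coeffs):      # Horner evaluation mod PRIME
            acc = (acc * x + c) % PRIME
        return acc

    return [(i, f(i)) for i in range(1, n_servers + 1)]

def recover(shares):
    """Lagrange interpolation at 0 from a list of (index, value) shares."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % PRIME
                den = den * (xi - xj) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret
```

With threshold t, an intruder who steals the shares of at most t servers learns nothing about the key, while any t + 1 servers can still use it jointly.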

First, the key generation process has to be distributed, in order to generate
the shares of each server without a trusted party. This has been done in both
the discrete logarithm [37,30,21] and the RSA [10,24,20] settings. For signature
schemes, the signing process has been distributed in both environments [43,29,
28,22,40,47] as well.

For distributing the decryption process, similar techniques can be used, as
long as one only wants to prevent chosen-plaintext attacks from passive adversaries
(see below for precise definitions). However, when we want to prevent
chosen-ciphertext attacks, in general, servers cannot start decryption before
knowing whether the ciphertext is valid or not, because an attacker can be one of
these servers and, in the case of an invalid ciphertext, he would have learned some
information.
Consequently, when we try to share a cryptosystem, we should not wait until
the end of the decryption to know whether the servers can really decrypt
or not. Therefore, we have to integrate some proof of validity of the ciphertext
that should be publicly verifiable. Unfortunately, most of the known
cryptosystems secure against chosen-ciphertext attacks are not suitable. Indeed, in
their decryption processes, the alleged plaintext is decrypted, and the redundancy
is checked just before returning the plaintext. Since the redundancy involves a
hash function, the final check cannot be done efficiently in a distributed way.

1.3 Related Work 

There are two methods to distribute the decryption process of a cryptosystem. 
Whereas the first one uses randomness, the second follows the model described 
by Lee and Lim in [32] where the usual decryption process for attaining cryp- 
tosystems immune against CCA is reversed: the receiver starts checking whether 
the ciphertext is valid before decrypting. 

The first method has been proposed by Canetti and Goldwasser in [12].
In the Cramer-Shoup cryptosystem [14], the receiver can check the validity of
a ciphertext by using one part of the secret key, before decrypting the valid
ciphertext using the second part of the secret key. Therefore, one can think that
it is easy to share this cryptosystem. Canetti and Goldwasser [12] succeeded
in distributing this cryptosystem. But instead of checking the validity of the
ciphertext in a first round and decrypting it according to the validity, they
proposed a new strategy with only one round. The servers decrypt any ciphertext
submitted and the decryption process is randomized. The servers compute
$m \cdot (v'/v)^s$ where $s$ is a random value shared between the servers (part of the
secret key), $v$ the proof inside the ciphertext, and $v'$ the proof calculated by the
servers. In the centralized version, the decryption process verifies whether
$v = v'$ or not. In the distributed version, if the proof is correct, $(v/v')^s = 1$ and
the decryption gives $m$, otherwise it returns a random value. Nobody knows if the
decrypted message is correct or not if there is no redundancy in the plaintext $m$.
A solution is to decrypt the same ciphertext twice. If the results are the same, the
message was well-formed. The main drawback is that the servers must keep in the
secret key a sharing of a random $s$ and hence the length of the key is linear in
the number of decrypted messages. Consequently, even if the basic method
with two rounds appears to be slower, it has nice features in terms of storage and
avoids the need for a protocol to compute a shared random value.
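
The effect of the randomizing factor can be seen in a toy computation. This sketch only evaluates the expression m · (v'/v)^s modulo a small prime; it is not the Cramer-Shoup scheme or the distributed protocol, and the modulus `P` and the function names are assumptions chosen for illustration.

```python
P = 1019   # small prime modulus, illustrative only (P - 1 = 2 * 509)

def centralized_check(v: int, v_prime: int) -> bool:
    """Centralized decryption: explicitly test the proof."""
    return v == v_prime

def randomized_output(m: int, v: int, v_prime: int, s: int) -> int:
    """One-round variant: return m * (v'/v)^s mod P without testing v == v'."""
    ratio = v_prime * pow(v, -1, P) % P
    return m * pow(ratio, s, P) % P
```

When v = v' the ratio is 1 and the output is m for every s. When they differ, the output is m multiplied by a factor that depends on the secret s, so it looks random to anyone who does not know s.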

This method is unfortunately specific to the Cramer-Shoup cryptosystem.
The second method, used by Shoup and Gennaro [48], follows the paper of Lee
and Lim [32], with the El Gamal [17] cryptosystem, but in the random oracle
model [6]. First, they tried to add a non-interactive zero-knowledge proof of
knowledge of discrete logarithm, using the Schnorr signature [44]. But they
remarked that the decryption simulation without the secret key would require
exponential time, because of a combinatorial explosion of the forking lemma [39].
This explosion can be avoided under a stronger assumption [45]. They finally used
non-interactive zero-knowledge proofs of membership (as in [33]) to avoid the
rewinding, and thus the combinatorial explosion in the decryption simulation. In
fact, the simulation of the decryption process cannot rewind the machine. The
problem is the same as in the resettable zero-knowledge setting. Therefore, the
same techniques of proof of membership in a hard language can be used [5]. We
can note here that the proof of knowledge of Rackoff and Simon is actually a
proof of membership. In this cryptosystem, there are two keys as in [33]: one
belongs to the receiver and the other belongs to the sender. Since
the prover has one of the two keys, he can decrypt and obtain the plaintext.
Therefore, the proof turns out to be a proof of knowledge for a specific sender. The
sender can then decrypt messages and, since it is a proof of membership, we can
simulate the proof without using the rewinding technique.



1.4 The Basic Tool: Non-interactive Zero-Knowledge Proof Systems

The model proposed by Naor and Yung relies heavily on non-interactive
zero-knowledge proofs of language membership in the common random string setting.
Because of that, they had to restrict the power of the security model to
lunch-time attacks, since the adversary could use the target ciphertext and generate a
new proof of membership. If the proof was correct, the decryption oracle would
decrypt it. But Naor and Yung cannot prove that the proof of membership cannot be
changed by someone who does not know a witness. Indeed, they did not use any
non-malleability property for the non-interactive zero-knowledge proof. Recently,
this property has been considered [42], but only for theoretical proof systems.

In this paper, we use the idealized assumption of the random oracle model [6],
which assumes that some functions behave like truly random functions. This
allows us to build efficient non-interactive zero-knowledge proofs, without the
common random string setting, which achieve a notion weaker than non-malleability
but strong enough for our purpose: simulation soundness [42].

Simulation Soundness. Let us consider any language $\mathcal{L}$, and a non-interactive
zero-knowledge proof system for $\mathcal{L}$. For any adversary $A$, with access to a proof
$p^*$ for a word $x^*$, in or out of $\mathcal{L}$, we consider her ability to forge a new proof $p$
for a word out of $\mathcal{L}$. Therefore, for any adversary $A$, we consider

$$\mathsf{Succ}^{\mathsf{sim\text{-}nizk}}(A) = \Pr[(x, p) \leftarrow A(Q) \mid x \in \overline{\mathcal{L}} \wedge (x, p) \notin Q],$$

having access to a bounded list $Q$ of proven words $(x^*, p^*)$, where the word $x^*$
is any word (in or out of the language $\mathcal{L}$) and $p^*$ an accepted proof for $x^*$. We
denote by $\overline{\mathcal{L}}$ the complement of $\mathcal{L}$, and thus all the words out of the language $\mathcal{L}$.

More generally, we denote by $\mathsf{Succ}^{\mathsf{sim\text{-}nizk}}(t)$ the maximal success probability
over any adversary, with running time bounded by $t$, in forging a new accepted
proof for an invalid word, even after having seen a bounded number of accepted
proofs on (in)valid words. In our situation, this bounded number will just be
one.

This is a stronger notion than the classical soundness for non-interactive
zero-knowledge proofs, but a weaker one than non-malleability. Indeed, Sahai [42]
showed that non-malleability of non-interactive zero-knowledge proofs implies
this notion, which he calls simulation soundness.

As we see in the sequel, in the random oracle model, we can provide efficient
proofs which achieve this security level.

1.5 Our Solution 

Fujisaki and Okamoto [26] proposed a generic conversion from any IND-CPA
cryptosystem into an IND-CCA one, in the random oracle model [6]. In this
paper, we revisit the twin-encryption technique of Naor and Yung [33], by providing
a generic conversion from any IND-CPA cryptosystem into an IND-CCA one with
publicly verifiable validity of the ciphertext (against the same kind of
adversary, see below). Namely, this conversion provides strongly secure threshold
cryptosystems. We furthermore present practical instantiations in the random
oracle model, which achieve IND-CCA against active and adaptive adversaries.

2 Security Model 

2.1 The Network 

We assume a group of $\ell$ (probabilistic) servers, all connected to a common
broadcast medium, called the communication channel. It can be an asynchronous
channel like the Internet.



2.2 The Adversary 

The adversary is computationally bounded and it can corrupt servers at any 
time by viewing the memories of corrupted servers (passive adversary), and/or 
modifying their behavior (active adversary). The adversary decides on whom 
to corrupt at the start of the protocol (static adversary). We also assume that 
the adversary corrupts no more than $t$ out of $\ell$ servers throughout the protocol,
where $\ell > 2t + 1$.



2.3 Threshold Cryptosystems 

A $t$ out of $\ell$ threshold cryptosystem consists of the following components:

— A key generation algorithm $\mathcal{K}$ that takes as input a security parameter in
unary notation $1^k$, the number $\ell$ of decryption servers, and the threshold
parameter $t$; it outputs a public key $pk$, a list $sk_1, \ldots, sk_\ell$ of private keys
(which represents a sharing of the private key $sk$) and a list $vk_1, \ldots, vk_\ell$ of
verification keys.

— An encryption algorithm $\mathcal{E}$ that takes as input the public key $pk$ and a
cleartext $m$, and outputs a ciphertext $c$.

— Several decryption algorithms $\mathcal{D}_i$ (for $1 \le i \le \ell$) that take as input the public
key $pk$, the private key $sk_i$, a ciphertext $c$, and output a decryption share $\sigma_i$
(which may include a verification part to achieve robustness).

— A recovery algorithm that takes as input the public key $pk$, a ciphertext $c$,
and a list $\sigma_1, \ldots, \sigma_\ell$ of decryption shares (or at least $t + 1$ of them), together
with the verification keys $vk_1, \ldots, vk_\ell$, and outputs a cleartext $m$ or rejects if
fewer than $t + 1$ decryption shares are correct in the case of active adversaries.
All users can run this algorithm.
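
The four components can be made concrete with a toy threshold El Gamal scheme. This is a sketch under several assumptions: the key is dealt by a trusted dealer using Shamir sharing rather than by a distributed protocol, verification keys and robustness are omitted, the group is tiny, and every name below is hypothetical.

```python
P, Q, G = 1019, 509, 4   # toy group: P = 2Q + 1, G generates the order-Q subgroup

def poly_eval(coeffs, x):
    """Evaluate the sharing polynomial at x, modulo Q."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % Q
    return acc

def keygen(coeffs, n_servers):
    """coeffs[0] is the secret key; the threshold is t = len(coeffs) - 1."""
    pk = pow(G, coeffs[0], P)
    sks = {i: poly_eval(coeffs, i) for i in range(1, n_servers + 1)}
    return pk, sks

def encrypt(pk, m, r):
    """El Gamal encryption of m with randomness r."""
    return pow(G, r, P), m * pow(pk, r, P) % P

def decryption_share(sk_i, c):
    """Server i's decryption share: the first component raised to sk_i."""
    a, _ = c
    return pow(a, sk_i, P)

def recover(c, shares):
    """shares: dict {server index: decryption share}, at least t + 1 entries."""
    a, b = c
    mask = 1
    for i, sigma in shares.items():
        lam = 1                               # Lagrange coefficient at 0
        for j in shares:
            if j != i:
                lam = lam * (-j) * pow((i - j) % Q, -1, Q) % Q
        mask = mask * pow(sigma, lam, P) % P  # accumulates a^sk
    return b * pow(mask, -1, P) % P
```

For example, with the polynomial coefficients `[123, 45, 67]` (threshold t = 2) and five servers, any three decryption shares suffice for `recover`.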

2.4 Security Notions 

In this section, we define the game an adversary plays and tries to win in order to
achieve the goal of the attack. An adversary against a threshold cryptosystem tries
to attack the following two properties:

— Security of the underlying primitive. In the case of a cryptosystem, it means
one-wayness, semantic security [31], or non-malleability [16].

— Robustness. This means that corrupted players should not be able to pre- 
vent uncorrupted servers from decrypting ciphertexts. This notion is useful 
only in the presence of active adversaries. In other terms, it means that the 
decryption service is available even if the adversary can send bad decryption 
shares. 

A user who wants to decrypt a ciphertext $c$ sends it to a special server, called
the combiner, who forwards it to all servers. The servers start by checking the
validity of the ciphertext, then compute a decryption share $\sigma_i$ and eventually
return it to the combiner. The combiner then combines the decryption shares to
obtain the plaintext $m$ and returns it to the user. If we want to withstand active
adversaries, the combiner must decide, when he receives decryption shares $\sigma_i$,
whether they are valid or not. A nice way is to use checking protocols [23], and
verification keys are consequently needed. The goal of checking protocols is to
allow each server to prove to others that it has achieved its task correctly.

Semantic Security. In the following, we focus on the semantic security [31]
goal, denoted IND, and set aside the other security notions (one-wayness and
non-malleability). Therefore, the game to consider is the following:

1. The key generation algorithm $\mathcal{K}$ is run. The adversary therefore receives the
public key $pk$. With this public key, the adversary has the ability to encrypt
any plaintext of his choice (hence the basic “chosen-plaintext attack”).

2. The adversary chooses two cleartexts $m_0$ and $m_1$. These are given to an
“encryption oracle” that chooses $b \in \{0, 1\}$ at random, encrypts $m_b$, and
gives the ciphertext $c$ to the adversary.

3. At the end of the game, the adversary outputs $b' \in \{0, 1\}$. We say that the
adversary wins the game if $b' = b$.

Semantic security against chosen-plaintext attack means that for any
polynomial time bounded adversary, $b' = b$ with probability only negligibly greater
than 1/2.
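
The three steps of the game can be written as a small harness. The sketch below is illustrative only: the harness signature, the toy deterministic scheme (textbook RSA with tiny parameters) and the adversary functions are assumptions introduced to show why a deterministic scheme loses the game.

```python
import secrets

def ind_cpa_game(keygen, encrypt, choose, guess) -> bool:
    """Run one IND-CPA game; return True if the adversary wins."""
    pk, _sk = keygen()                      # step 1: adversary gets pk
    m0, m1, state = choose(pk)              # step 2: adversary picks m0, m1
    b = secrets.randbelow(2)                # oracle's hidden bit
    c = encrypt(pk, (m0, m1)[b])
    return guess(pk, state, c) == b         # step 3: adversary outputs b'

# Toy deterministic scheme: textbook RSA with n = 61 * 53, e = 17, d = 2753.
def toy_keygen():
    return (3233, 17), 2753

def toy_encrypt(pk, m):
    n, e = pk
    return pow(m, e, n)

# Adversary: since encryption is deterministic, re-encrypt m0 and compare.
def choose(pk):
    return 2, 3, None

def guess(pk, state, c):
    return 0 if c == toy_encrypt(pk, 2) else 1
```

This adversary wins every run, so the toy scheme is not semantically secure; a semantically secure scheme must randomize encryption so that re-encrypting $m_0$ does not reproduce the challenge.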
Chosen Ciphertext Attacks. A stronger attack is usually considered, the
so-called chosen-ciphertext attack [41], in which the adversary is given full
access to the decryption oracle $\mathcal{D}_{sk}$, feeding it with any ciphertext. It therefore
obtains the corresponding plaintext, or the “reject” answer. There is the trivial
restriction not to ask the challenge ciphertext.



Threshold Security. The above attacks are the classical attacks in the
standard (non-threshold) setting of the cryptosystem. Even if it is a threshold one,
the view of the adversary is the same as if there were only one secret key.
However, in the threshold setting, we have to consider the leakage of decryption
shares. To this aim, we give a new oracle access to the adversary: the adversary
is given full access to the decryption oracles $\mathcal{D}_{sk_i}$, but feeding them with a
valid plaintext-ciphertext pair. It therefore obtains the decryption share $\sigma_i$.
If the pair is not valid (the ciphertext does not encrypt the given plaintext)
the oracle may output anything [19]. This is therefore the basic security
notion (for both IND-CPA and IND-CCA) in the threshold setting: IND-TCPA and
IND-TCCA respectively.

As explained in the motivation of threshold cryptosystems, such a scheme
should resist the corruption of some servers. Therefore, we have to consider
this situation, which means that the adversary has control of some servers:

— still playing honestly — the adversary is thus a passive adversary. He has 

access to any internal data of some servers, but cannot modify their behavior. 

— or modifying their behavior — the adversary is then an active adversary. 

To sum up, we have several possible mixes of attacks and adversaries: the
chosen-plaintext (CPA) or chosen-ciphertext (CCA) attacks, performed by
passive (-Passive) or active (-Active) adversaries. According to the choice of
corrupted servers, we consider adaptive or non-adaptive adversaries. Non-adaptive
adversaries make their choice first (before anything else), whereas adaptive ones
make their choice during the attack, adaptively. It has been proven that passive
and adaptive adversaries are equivalent to passive and non-adaptive adversaries,
when the number of servers is logarithmic [11].

One may remark that in the particular case where $\ell = 1$ and $t = 0$, we are
back to the classical situation, where passive/active and (non)-adaptive
adversaries are meaningless.



3 Generic Conversions into IND-CCA Cryptosystems 

In this section, we revisit the twin-encryption paradigm proposed by Naor and
Yung [33], while assuming that $(\mathcal{K}, \mathcal{E}, \mathcal{D})$ is a (possibly threshold) cryptosystem
which already achieves semantic security against chosen-plaintext attacks
(IND-CPA or IND-TCPA, in the threshold setting). Then, we provide a new scheme
which prevents CCA (or TCCA, resp.) whatever the kind of adversary.
3.1 Generic Conversion GC

The Key Generation: $\mathsf{K}(1^k)$ runs $\mathcal{K}(1^k)$ twice to get two public keys $(pk, pk')$,
which represent the new public key $PK$. In the same way, one defines the new set
of secret keys as $SK = \{SK_i\}_{1 \le i \le \ell} = \{sk, sk'\} = \{sk_i, sk'_i\}_{1 \le i \le \ell}$ and the new set
of verification keys $VK = \{VK_i\}_{1 \le i \le \ell} = \{vk, vk'\} = \{vk_i, vk'_i\}_{1 \le i \le \ell}$.

Encryption of $m$

— one first encrypts $m$ twice, under $pk$ and $pk'$: $a_0 = \mathcal{E}_{pk}(m)$ and $a_1 = \mathcal{E}_{pk'}(m)$;

— one then builds a proof that both ciphertexts encrypt the same plaintext
under the keys $pk$ and $pk'$ respectively, $c = \mathrm{Proof}[pk, pk', \mathcal{D}_{sk}(a_0) = \mathcal{D}_{sk'}(a_1)]$.

Partial Decryption of $(a_0, a_1, c)$

— the server checks the validity of the proof $c$;

— it computes both decryption shares of the ciphertexts $a_0$ and $a_1$ (only one
could be enough, but the same random choice should be done by all the
servers).

It is then possible to reconstruct the plaintext, using the recovery algorithm.

With this generic construction, it is not clear that the proof $c$ does not leak
any information (as remarked in [33]); furthermore, such a proof can seldom be
done efficiently in the standard model. However, the random oracle model allows
one to build efficient non-interactive zero-knowledge proofs [39].
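
To make the shape of the conversion concrete, here is a minimal single-server sketch (the case of one server and threshold zero) that instantiates the twin construction with toy El Gamal and a Fiat-Shamir proof of plaintext equality. This is an illustration under stated assumptions and not the instantiation proposed in this paper: the El Gamal choice, the proof system, SHA-256 as a stand-in for the random oracle, the tiny group and all names are hypothetical.

```python
import hashlib

P, Q, G = 1019, 509, 4   # toy group: P = 2Q + 1, G has prime order Q

def challenge(*items) -> int:
    """Random-oracle stand-in: hash the transcript to a challenge in Z_Q."""
    digest = hashlib.sha256(repr(items).encode()).digest()
    return int.from_bytes(digest, "big") % Q

def twin_encrypt(pk, pk2, m, r1, r2, k1, k2):
    """Encrypt m under both keys and prove that the two plaintexts are equal.

    r1, r2 are the encryption coins; k1, k2 are the proof nonces (all in Z_Q).
    """
    a0 = (pow(G, r1, P), m * pow(pk, r1, P) % P)
    a1 = (pow(G, r2, P), m * pow(pk2, r2, P) % P)
    t1, t2 = pow(G, k1, P), pow(G, k2, P)
    t3 = pow(pk, k1, P) * pow(pk2, (-k2) % Q, P) % P
    e = challenge(pk, pk2, a0, a1, t1, t2, t3)
    proof = (t1, t2, t3, (k1 + e * r1) % Q, (k2 + e * r2) % Q)
    return a0, a1, proof

def verify(pk, pk2, a0, a1, proof) -> bool:
    """Publicly check the proof; no secret key is needed."""
    t1, t2, t3, z1, z2 = proof
    e = challenge(pk, pk2, a0, a1, t1, t2, t3)
    d = a0[1] * pow(a1[1], -1, P) % P   # pk^r1 * pk2^(-r2) when plaintexts match
    return (pow(G, z1, P) == t1 * pow(a0[0], e, P) % P
            and pow(G, z2, P) == t2 * pow(a1[0], e, P) % P
            and pow(pk, z1, P) * pow(pk2, (-z2) % Q, P) % P == t3 * pow(d, e, P) % P)

def decrypt(sk, pk, pk2, a0, a1, proof):
    """Reject invalid ciphertexts first, then decrypt a0 with the first key."""
    if not verify(pk, pk2, a0, a1, proof):
        return None
    u, v = a0
    return v * pow(u, (-sk) % Q, P) % P
```

The point of the design is that `verify` uses only public values, so every server can reject a malformed ciphertext before producing any decryption share.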



3.2 Non-interactive Zero-Knowledge Proofs 

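In order to make the following proof work, we need a strong security notion,
called simulation soundness [42], for the proof c on the language

$$\mathcal{L} = \{(\mathsf{pk}, \mathsf{pk}', \mathcal{E}_{\mathsf{pk}}(m), \mathcal{E}_{\mathsf{pk}'}(m)) \mid \forall m\}.$$

Indeed, we want that any adversary A, having seen a pair $(x^*, c^*)$, where
$x^* = (\mathsf{pk}, \mathsf{pk}', \mathcal{E}_{\mathsf{pk}}(m), \mathcal{E}_{\mathsf{pk}'}(m'))$ (with $m = m'$ but also possibly $m \ne m'$) and
$c^*$ an accepted proof for $x^*$, has a negligible success probability in forging
a new proof c for a word $x \notin \mathcal{L}$:

$$\mathsf{Succ}^{\mathsf{sim\text{-}nizk}}(A) = \Pr[(x, c) \leftarrow A(x^*, c^*) \mid x \notin \mathcal{L} \wedge (x, c) \ne (x^*, c^*)].$$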

The idea behind this success probability is that the adversary should not be
able to build a new proof from previous ones, except for valid words (that is,
words in $\mathcal{L}$). Indeed, one cannot prevent the adversary from building an accepted
proof for a correct word chosen by herself, and in such a case the ciphertext is
valid.

Furthermore, the adversary has access to a proof for a word in $\mathcal{L}$, or maybe
out of $\mathcal{L}$, because the simulator will sometimes create an accepted proof for a
word that is not in $\mathcal{L}$. Such a proof should not give any further information to
the adversary either.

The proof c convinces everybody that the ciphertext is valid before starting 
the decryption. In the security proof, the decryption simulator knows one secret 
key. But the challenge ciphertext will not necessarily be a valid one (possibly 
with two distinct encrypted messages) . Thanks to the random oracle model, it is 
still possible to simulate, in an indistinguishable way, an accepted proof even for 
such a wrong string, under the assumption of the intractability of the problem 
of deciding membership (a weaker assumption than the semantic security of the 
underlying cryptosystem) . 

Finally, we present some practical non-interactive zero-knowledge proofs, 
which are easily proven to be simulation-sound using the forking lemma tech- 
nique [39]. 

3.3 Security Proof 

We show that from any adversary A against IND-CCA of the twin scheme, we can
build an adversary B against IND-CPA of the original scheme, first considering
only passive adversaries.

3.4 Passive Adversaries 

Theorem 1. Given an IND-CPA (or IND-TCPA) cryptosystem S, the twin conversion
provides an IND-CCA (or IND-TCCA, resp.) cryptosystem $S_{tw}$, in the
random oracle model.

Proof. Our proof proceeds by reduction. Given a $(t, \varepsilon)$-adversary A against our
scheme $S_{tw}$ in the sense of IND-CCA, we build a $(t', \varepsilon')$-attacker B against scheme
S where $t' = t$ and $\varepsilon' = (\varepsilon - 9 \cdot \mathsf{Succ}^{\mathsf{sim\text{-}nizk}}(t))/4$.

First of all, one can note that if a (classical) cryptosystem is IND-CPA, then
if we encrypt the same message under two different public keys, the resulting
twin-cryptosystem is still IND-CPA. This result can be shown by applying hybrid
techniques [31] and it has already been formally proven in [3,2], with an
advantage loss of a factor 2.

Now, we show how to make the reduction. The attacker B receives a given
public key pk and we show how this attacker can use the adversary A that
breaks IND-CCA to win the game (IND-CPA). The simulator B runs $\mathcal{K}(1^k)$ and
gets $(\mathsf{pk}', \mathsf{sk}' = \{\mathsf{sk}'_i\})$. He tosses a coin b, and sets $\mathsf{pk}_b = \mathsf{pk}$, while
$\mathsf{pk}_{1-b} = \mathsf{pk}'$. Then, he sends $(\mathsf{pk}_0, \mathsf{pk}_1)$ to A.

At step 2 in the game, the adversary A outputs two messages $m_0, m_1$.
The simulator B sends them to the challenger: the challenger chooses at random
a bit b' and encrypts $m_{b'}$ under pk, yielding $a_b = \mathcal{E}_{\mathsf{pk}_b}(m_{b'})$.

Then, B tosses a new coin b'' at random and computes $a_{1-b} = \mathcal{E}_{\mathsf{pk}_{1-b}}(m_{b''})$
and sends to the adversary the target ciphertext $y^* = (a_0, a_1, c^*)$, where $c^*$ is
a simulated proof of correctness of $a_0$ and $a_1$, which can be done in an
indistinguishable way in the random oracle model, under the intractability of the
decision problem: do $a_0$ and $a_1$ encrypt the same message?







P.-A. Fouque and D. Pointcheval 



Now, we show how to simulate the decryption oracle. Adversary A can
perform queries $y = (a_0, a_1, c)$ to the decryption oracle, at any time, where
$a_0$ and $a_1$ are ciphertexts under $\mathsf{pk}_0$ and $\mathsf{pk}_1$ respectively, and c is a proof
of correctness of the ciphertext. The simulator B easily decrypts $a_{1-b}$, as he
knows the secret keys related to $\mathsf{pk}_{1-b} = \mathsf{pk}'$. If the proof is correct we know
that $a_0$ and $a_1$ encrypt the same value m. This simulation is perfect. If the
proof is not correct, but accepted, the adversary has broken the simulation
soundness, after having seen only one proof.

Finally, A answers with a bit $b^*$, which is output by B. Since the simulation
may not be perfect, the adversary may never stop. In this latter case, after
a time-out, B flips a coin $b^*$. The attacker B has won if $b^* = b'$, and thus
with probability

$$\Pr[b^* = b'] = \Pr[b^* = b' \wedge \mathsf{NIZK}] + \Pr[b^* = b' \wedge \neg\mathsf{NIZK}] \ge \Pr[b^* = b' \mid \mathsf{NIZK}] \cdot \Pr[\mathsf{NIZK}].$$

In the above formula, NIZK denotes the event that none of the proofs sent by 
the adversary to the decryption oracle breaks the simulation soundness, after 
having possibly seen one proof. 

Indeed, if the adversary can forge proofs of membership for wrong words, the
simulator will always answer with the message encrypted under $\mathsf{pk}_{1-b}$. Therefore,
the adversary can decide which key the simulator holds.

However, under the assumption NIZK, saying that the adversary did not forge
a wrong proof, our simulation of the decryption oracle is perfect. Then, using
the notation pr for probabilities under this assumption:

— in the case $b'' = b'$, the simulation is perfect. Indeed, the challenge
ciphertext is a valid ciphertext, and all the decryption queries are valid
ciphertexts (under the NIZK assumption). And thus, the advantage is greater
than $\varepsilon/2$, thanks to results about multicast encryption [2,3] (except for a
possible advantage in the real game thanks to an attack on the soundness).
Thus

$$\mathrm{pr}[b^* = b' \mid b'' = b'] \ge \frac{1 + \varepsilon/2}{2} - \Pr[\neg\mathsf{NIZK}] = \frac{1}{2} + \frac{\varepsilon}{4} - \Pr[\neg\mathsf{NIZK}].$$

— in the case $b'' \ne b'$, even a powerful adversary that can decrypt $a_0$ and $a_1$
will obtain $m_0$ and $m_1$. Therefore, he cannot get any advantage. However,
the adversary who detects it may choose to never stop, or to cheat. If she
decides to never stop, the time-out makes B flip a coin. If she tries to
cheat, she has no information about b'. Then, $\mathrm{pr}[b^* = b' \mid b'' \ne b'] = 1/2$.

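Therefore,

$$\frac{\varepsilon' + 1}{2} \ge \frac{1}{2}\left(\mathrm{pr}[b^* = b' \mid b'' = b'] + \mathrm{pr}[b^* = b' \mid b'' \ne b']\right) \cdot \Pr[\mathsf{NIZK}]$$

$$\ge \left(\frac{1}{2} + \frac{\varepsilon}{8} - \frac{1}{2}\Pr[\neg\mathsf{NIZK}]\right) \cdot \Pr[\mathsf{NIZK}] \ge \frac{1}{2} + \frac{\varepsilon}{8} - \frac{9}{8}\Pr[\neg\mathsf{NIZK}].$$

And thus,

$$\varepsilon' = 2\Pr[b^* = b'] - 1 \ge \frac{\varepsilon - 9 \cdot \Pr[\neg\mathsf{NIZK}]}{4}.$$
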
In order to upper bound $\Pr[\neg\mathsf{NIZK}]$, we play the same game but knowing the
two secret keys. Then, as soon as the adversary produces an accepted proof for
an invalid word, we detect it, and thus output it. This breaks the simulation
soundness within time t: $\Pr[\neg\mathsf{NIZK}] \le \mathsf{Succ}^{\mathsf{sim\text{-}nizk}}(t)$. □
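
The constant 9 in the bound on the advantage of B can be checked by a short
calculation. The following is a sketch of that step, not taken from the original
text: it writes p for the probability that the event NIZK fails (a shorthand
introduced here) and assumes only that the advantage of A is at most 1.

```latex
% Shorthand (assumption of this sketch): p = \Pr[\neg\mathsf{NIZK}], so \Pr[\mathsf{NIZK}] = 1 - p,
% and 0 \le \varepsilon \le 1.
\begin{align*}
\left(\tfrac{1}{2} + \tfrac{\varepsilon}{8} - \tfrac{p}{2}\right)(1 - p)
  &= \tfrac{1}{2} + \tfrac{\varepsilon}{8} - \tfrac{p}{2}
     - \tfrac{p}{2} - \tfrac{\varepsilon p}{8} + \tfrac{p^2}{2} \\
  &\ge \tfrac{1}{2} + \tfrac{\varepsilon}{8} - p\left(1 + \tfrac{\varepsilon}{8}\right)
     && \text{(drop } p^2/2 \ge 0\text{)} \\
  &\ge \tfrac{1}{2} + \tfrac{\varepsilon}{8} - \tfrac{9}{8}\,p
     && \text{(use } \varepsilon \le 1\text{)} .
\end{align*}
% Doubling and subtracting 1 gives \varepsilon' \ge (\varepsilon - 9p)/4.
```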

3.5 Active Adversaries 

It is clear that the proof still holds whatever the kind of adversary, even in
the threshold setting. We provided a rigorous proof without any corruption. But
if the underlying scheme already achieves IND-TCPA against passive or active
adversaries, the new one achieves IND-TCCA against the same kind of adversaries.

4 Examples 

The first example of semantically secure cryptosystem with easy proofs of equal- 
ity of plaintexts is certainly the El Gamal cryptosystem [17]. Even if more effi- 
cient threshold versions have already been proposed [48] (even in the standard 
model [12]), we apply the first conversion on it. 

The second example will provide the first RSA-based threshold cryptosystem
secure under chosen-ciphertext attacks, even against active and adaptive
adversaries. It is based on Paillier’s cryptosystem [35,19]. Another way to share
the Paillier cryptosystem appears in [15].

In this part, we describe the cryptosystems, with emphasis on the proofs of
membership, which are specific to each of them.



4.1 The El Gamal Cryptosystem 

Description of the El Gamal Cryptosystem. Let p be a strong prime, such
that $q \mid p - 1$ is also a large prime, and g be an element of $\mathbb{Z}_p^*$ of order q.
We thus denote by G the subgroup of $\mathbb{Z}_p^*$ of the elements of order q. It is
spanned by g. Let $y = g^x$ be the public key corresponding to the secret key x.
To encrypt a message $M \in G$, randomly choose $r \in \mathbb{Z}_q$ and compute the
ciphertext $(M \cdot y^r, g^r)$. To decrypt a ciphertext $a = (\alpha, \beta)$, the receiver
computes $\alpha / \beta^x$. It is well-known that the semantic security of El Gamal is
based on the Decisional Diffie-Hellman (DDH) problem [49].
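
As an illustration of the description above, here is a minimal Python sketch of
El Gamal in a prime-order subgroup. The concrete parameters (p = 2039, q = 1019,
g = 4) and the function names are assumptions chosen for readability; they are
toy values, far too small for any real use.

```python
import secrets

# Toy parameters (assumed for illustration only): p = 2q + 1 with p, q prime.
P = 2039   # prime modulus
Q = 1019   # prime order of the subgroup G
G = 4      # 4 = 2^2 is a square different from 1, hence of order q in Z_p^*

def keygen():
    """Return (x, y) with secret key x and public key y = g^x."""
    x = secrets.randbelow(Q - 1) + 1
    return x, pow(G, x, P)

def encrypt(y, m):
    """Encrypt a message m in G as (alpha, beta) = (m * y^r, g^r)."""
    r = secrets.randbelow(Q)
    return (m * pow(y, r, P)) % P, pow(G, r, P)

def decrypt(x, ct):
    """Recover m = alpha / beta^x (negative exponent needs Python 3.8+)."""
    alpha, beta = ct
    return (alpha * pow(beta, -x, P)) % P
```

Messages must lie in the subgroup G, for instance `m = pow(G, 123, P)`; then
`decrypt(x, encrypt(y, m))` returns `m` for any key pair produced by `keygen()`.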



IND-CPA Threshold Version of El Gamal Cryptosystem. The secret key
x is split with the Shamir secret sharing scheme. Each server has a share $\mathsf{sk}_i$ of
the secret key sk and a verification key $\mathsf{vk}_i = g^{\mathsf{sk}_i}$. To decrypt a ciphertext
$a = (\alpha, \beta)$, each server computes a decryption share $\beta_i = \beta^{\mathsf{sk}_i}$ and proves that
$\log_g \mathsf{vk}_i = \log_\beta \beta_i$. The combiner selects a set S of t + 1 correct shares and
computes

$$\beta^x = \prod_{i \in S} \beta_i^{\lambda^S_{0,i}} \bmod p,$$

where $\lambda^S_{0,i}$ denote the Lagrange coefficients. Finally, the combiner computes
$\alpha / \beta^x \bmod p$ to recover the plaintext. One can easily show that if an
adversary can break the semantic security of this cryptosystem, one can build an
attacker that can break the semantic security of El Gamal, and thus the DDH
assumption.
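
The share-and-combine step can be sketched as follows. This is only an
illustration under stated assumptions: the toy group (p = 2039, q = 1019,
g = 4) and all function names are hypothetical, and the proofs of correct
decryption shares ($\log_g \mathsf{vk}_i = \log_\beta \beta_i$) are omitted, so the sketch
covers honest servers only.

```python
import secrets

P, Q, G = 2039, 1019, 4   # toy group (assumed): p = 2q + 1, g of prime order q

def share_secret(x, t, n):
    """Shamir-share x among servers 1..n using a random degree-t polynomial over Z_q."""
    coeffs = [x] + [secrets.randbelow(Q) for _ in range(t)]
    return {i: sum(c * pow(i, k, Q) for k, c in enumerate(coeffs)) % Q
            for i in range(1, n + 1)}

def lagrange_at_zero(i, subset):
    """Lagrange coefficient of index i for interpolation at 0, modulo q."""
    num, den = 1, 1
    for j in subset:
        if j != i:
            num = num * j % Q
            den = den * (j - i) % Q
    return num * pow(den, -1, Q) % Q

def decryption_share(sk_i, ct):
    """Server i raises beta to its share: beta_i = beta^{sk_i}."""
    _, beta = ct
    return pow(beta, sk_i, P)

def combine(ct, shares):
    """Combine t+1 shares {i: beta_i} into beta^x and return alpha / beta^x."""
    alpha, _ = ct
    subset = list(shares)
    beta_x = 1
    for i, beta_i in shares.items():
        beta_x = beta_x * pow(beta_i, lagrange_at_zero(i, subset), P) % P
    return alpha * pow(beta_x, -1, P) % P
```

Interpolation happens in the exponent, so no server ever reconstructs x itself;
any set of t + 1 honest shares yields the same plaintext.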



IND-CCA Threshold Version of El Gamal Cryptosystem. We can therefore
apply the previous twin conversion. One still uses a single group G, with a
generator g of prime order. Then the key generation algorithm is run twice and
the public keys are $y_0 = g^{x_0}$ and $y_1 = g^{x_1}$. To encrypt a message M, the sender
computes $a_0 = (M \cdot y_0^r, g^r) = (\alpha_0, \beta_0)$ and $a_1 = (M \cdot y_1^s, g^s) = (\alpha_1, \beta_1)$.

The proof of equality of plaintexts consists in proving the existence of r and
s such that $\beta_0 = g^r$, $\beta_1 = g^s$ and $\alpha_0/\alpha_1 = y_0^r y_1^{-s}$.

To this aim, one chooses random $a, b \in \mathbb{Z}_q$, and computes $A = g^a$, $B = g^b$
and $C = y_0^a y_1^b$. Then, one gets the random challenge $e \in \mathbb{Z}_q$ from a hash function
which is assumed to behave like a random oracle: $e = H(g, y_0, y_1, a_0, a_1, A, B, C)$.
Eventually, one computes $\rho = a - re \bmod q$ and $\sigma = b + se \bmod q$. This proof
can be easily verified by $A = g^\rho \beta_0^e$, $B = g^\sigma \beta_1^{-e}$ and $C = y_0^\rho y_1^\sigma (\alpha_0/\alpha_1)^e$, or
equivalently by

$$e = H(g, y_0, y_1, a_0, a_1, g^\rho \beta_0^e, g^\sigma \beta_1^{-e}, y_0^\rho y_1^\sigma (\alpha_0/\alpha_1)^e),$$

where the proof consists of the tuple $(e, \rho, \sigma)$.

The decryption process is straightforward, using the same technique as
presented above, but twice, after having checked the validity of the ciphertext.
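
The twin encryption and its proof of equality of plaintexts can be sketched in
Python as below. Several things are assumptions of this sketch rather than
statements of the paper: the toy group, the use of SHA-256 reduced modulo q as a
stand-in for the random oracle H, the function names, and the sign convention
($\sigma = b + se$, with $B$ checked as $g^\sigma \beta_1^{-e}$), which is the reading of the
garbled formulas that makes the verification equations hold.

```python
import hashlib
import secrets

P, Q, G = 2039, 1019, 4   # toy group (assumed): p = 2q + 1, g of prime order q

def H(*values):
    """Stand-in for the random oracle: SHA-256 of the inputs, reduced mod q."""
    data = "|".join(str(v) for v in values).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q

def twin_encrypt(y0, y1, m):
    """Encrypt m under both keys and prove that both ciphertexts hold the same m."""
    r, s = secrets.randbelow(Q), secrets.randbelow(Q)
    a0 = (m * pow(y0, r, P) % P, pow(G, r, P))
    a1 = (m * pow(y1, s, P) % P, pow(G, s, P))
    a, b = secrets.randbelow(Q), secrets.randbelow(Q)
    A, B = pow(G, a, P), pow(G, b, P)
    C = pow(y0, a, P) * pow(y1, b, P) % P
    e = H(G, y0, y1, a0, a1, A, B, C)
    rho = (a - r * e) % Q
    sigma = (b + s * e) % Q
    return a0, a1, (e, rho, sigma)

def verify(y0, y1, a0, a1, proof):
    """Recompute A, B, C from (e, rho, sigma) and check the challenge."""
    e, rho, sigma = proof
    (alpha0, beta0), (alpha1, beta1) = a0, a1
    ratio = alpha0 * pow(alpha1, -1, P) % P
    A = pow(G, rho, P) * pow(beta0, e, P) % P
    B = pow(G, sigma, P) * pow(beta1, -e, P) % P
    C = pow(y0, rho, P) * pow(y1, sigma, P) * pow(ratio, e, P) % P
    return e == H(G, y0, y1, a0, a1, A, B, C)
```

With a group this small a forged proof passes with probability about 1/q, so the
sketch only illustrates the algebra, not the security level.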



Security Analysis. The basic threshold El Gamal cryptosystem is clearly
IND-CPA. The generic conversion then makes the new proposal IND-TCCA, but
under the condition that the above proof of equality of plaintexts is
simulation-sound. We thus have to prove it.

First, we have to be able to build a list Q of accepted proofs for words in
and out of the language. This can easily be done, thanks to the random oracle
property of H: one chooses $\rho$, $\sigma$ and e in $\mathbb{Z}_q$, and defines

$$H(g, y_0, y_1, a_0, a_1, g^\rho \beta_0^e, g^\sigma \beta_1^{-e}, y_0^\rho y_1^\sigma (\alpha_0/\alpha_1)^e) \leftarrow e.$$

Now, let us assume that with access to this list of proofs, an adversary is
able to forge a new proof for a wrong word $(\mathsf{pk}_0, \mathsf{pk}_1, a_0, a_1)$, with probability $\nu$,
within time t. Since everything is included in the query to the random oracle H,
we can apply the forking lemma [39], which claims that

Lemma 1. Let A be a probabilistic polynomial time Turing machine which can
ask $q_h$ queries to the random oracle, with $q_h > 0$. We assume that, within the
time bound t, A produces, with probability $\nu \ge 7q_h/q$, a new accepted proof for
a wrong word $(\mathsf{pk}_0, \mathsf{pk}_1, a_0, a_1)$, $(g, y_0, y_1, a_0, a_1; A, B; e; \rho, \sigma)$. Then, within time
$t' \le 16 q_h t/\nu$, and with probability $\nu' \ge 1/9$, a replay of this machine outputs two
accepted proofs of a wrong word $(\mathsf{pk}_0, \mathsf{pk}_1, a_0, a_1)$:

$(g, y_0, y_1, a_0, a_1; A, B; e_0; \rho_0, \sigma_0)$ and $(g, y_0, y_1, a_0, a_1; A, B; e_1; \rho_1, \sigma_1)$,

with $e_0 \ne e_1 \bmod q$.

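Let us assume that the adversary has not broken the collision intractability of
H, then

$$g^{\rho_0} \beta_0^{e_0} = g^{\rho_1} \beta_0^{e_1}, \quad g^{\sigma_0} \beta_1^{-e_0} = g^{\sigma_1} \beta_1^{-e_1},$$

$$y_0^{\rho_0} y_1^{\sigma_0} (\alpha_0/\alpha_1)^{e_0} = y_0^{\rho_1} y_1^{\sigma_1} (\alpha_0/\alpha_1)^{e_1}$$

and thus,

$$\beta_0 = g^\rho, \quad \beta_1 = g^\sigma, \quad\text{and}\quad \alpha_0/\alpha_1 = y_0^\rho y_1^{-\sigma},$$

where

$$\rho = \frac{\rho_1 - \rho_0}{e_0 - e_1} \bmod q, \quad\text{and}\quad \sigma = \frac{\sigma_0 - \sigma_1}{e_0 - e_1} \bmod q.$$

Since $\alpha_0 = M_0 y_0^\rho$ and $\alpha_1 = M_1 y_1^\sigma$, we eventually get $M_0 = M_1$, which means
that the word is in the language, unless one has broken the collision intractability
for H. But under the random oracle assumption, to get a probability greater than
1/9 to find a collision, one has to have asked more than $\sqrt{q}/3$ queries to H, using
the birthday paradox, and thus

$$\frac{16 q_h t}{\nu} \ge t' \ge \frac{\sqrt{q}}{3} \cdot \tau,$$

where $\tau$ is the time required for an evaluation of H. This leads to

$$\nu \le \frac{48 q_h t}{\tau \sqrt{q}}.$$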

This proves the soundness of the proof system. But since this lemma still holds, 
even for an adversary with auxiliary information (the list Q), it furthermore 
proves the simulation soundness. 



4.2 The Paillier Cryptosystem 

Review of the Basic Cryptosystem. The Paillier cryptosystem is based on
the properties of the Carmichael lambda function in $\mathbb{Z}^*_{n^2}$. We recall here the
two main properties: for any $w \in \mathbb{Z}^*_{n^2}$,

$$w^{\lambda(n)} = 1 \bmod n, \quad\text{and}\quad w^{n\lambda(n)} = 1 \bmod n^2.$$

Let n be an RSA modulus n = pq, where p and q are prime integers. Let g be
an integer of order $n\alpha$ modulo $n^2$. The public key is pk = (n, g) and the secret
key is sk = $\lambda(n)$. To encrypt a message $M \in \mathbb{Z}_n$, randomly choose $x \in \mathbb{Z}_n^*$ and
compute the ciphertext $c = g^M x^n \bmod n^2$. To decrypt c, compute

$$M = \frac{L(c^{\lambda(n)} \bmod n^2)}{L(g^{\lambda(n)} \bmod n^2)} \bmod n,$$

where the L function takes elements from the set $U_n = \{u < n^2 \mid u = 1 \bmod n\}$
and computes $L(u) = (u - 1)/n$. The semantic security is based on the difficulty
of distinguishing n-th residues modulo $n^2$. We refer to [35] for details.
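
The basic scheme reviewed above can be sketched in a few lines of Python. The
choice g = n + 1 (which has order n modulo n^2), the small primes used in the
usage note, and the function names are assumptions made for this illustration;
`math.lcm` requires Python 3.9 or later.

```python
import math
import secrets

def L(u, n):
    """L(u) = (u - 1) / n for u = 1 mod n."""
    return (u - 1) // n

def keygen(p, q):
    """Return ((n, g), lambda(n)) for primes p, q with gcd(n, phi(n)) = 1."""
    n = p * q
    lam = math.lcm(p - 1, q - 1)   # Carmichael lambda(n)
    g = n + 1                      # assumed choice: g of order n modulo n^2
    return (n, g), lam

def encrypt(pk, m):
    """c = g^m * x^n mod n^2 for a random unit x modulo n."""
    n, g = pk
    n2 = n * n
    while True:
        x = secrets.randbelow(n - 1) + 1
        if math.gcd(x, n) == 1:
            break
    return pow(g, m, n2) * pow(x, n, n2) % n2

def decrypt(pk, lam, c):
    """m = L(c^lambda mod n^2) / L(g^lambda mod n^2) mod n."""
    n, g = pk
    n2 = n * n
    num = L(pow(c, lam, n2), n)
    den = L(pow(g, lam, n2), n)
    return num * pow(den, -1, n) % n
```

For example, with the toy primes 1019 and 1031, multiplying two ciphertexts
modulo n^2 and decrypting the product gives the sum of the two plaintexts
modulo n.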
IND-CPA Threshold Version of Paillier Cryptosystem. We recall that
$\Delta = \ell!$ where $\ell$ is the number of servers.

Key Generation Algorithm. Choose an integer n, product of two safe primes p
and q, such that p = 2p' + 1 and q = 2q' + 1 and $\gcd(n, \varphi(n)) = 1$. One can
note that the safe prime requirement can be avoided [20] using the Shoup protocol
[47] without safe primes. This makes it possible to fully share the Paillier
cryptosystem, from the key generation protocol to the decryption process, since
it appears difficult to generate RSA moduli that are products of safe primes
using [10]. However, for the clarity of the description we use RSA moduli with
safe primes. Set m = p'q'. Let $\beta$ be an element randomly chosen in $\mathbb{Z}_n^*$.

The secret key sk = $\beta \times m$ is shared with the Shamir scheme [46] modulo
mn. Let v be a square that generates with overwhelming probability the cyclic
group of squares in $\mathbb{Z}^*_{n^2}$. The verification keys $\mathsf{vk}_i$ are obtained with the formula
$v^{\Delta \mathsf{sk}_i} \bmod n^2$.

Encryption Algorithm. To encrypt a message M, randomly pick $x \in \mathbb{Z}_n^*$ and
compute $c = g^M x^n \bmod n^2$.

Partial Decryption Algorithm. The player $P_i$ computes the decryption share
$c_i = c^{2\Delta \mathsf{sk}_i} \bmod n^2$ using his secret share $\mathsf{sk}_i$. He makes a proof of correct
decryption which assures that $c^{4\Delta} \bmod n^2$ and $v^\Delta \bmod n^2$ have been raised to the
same power $\mathsf{sk}_i$ in order to obtain $c_i^2$ and $\mathsf{vk}_i$.

Recovery Algorithm. If fewer than t + 1 decryption shares have valid proofs of
correctness the algorithm fails. Otherwise, let S be a set of t + 1 valid shares and
compute the plaintext using the Lagrange interpolation on the exponents (which
is possible since exponents are multiplied by $\Delta = \ell!$, and thus no modular root
extraction is required).

In [19], they proved the following theorem. 

Theorem 2. Under the decisional composite residuosity assumption and in the 
random oracle model, the threshold version of Paillier cryptosystem is IND- 
TCPA against active but non-adaptive adversaries. 

Even if their definition of threshold security (the partial decryption oracles be- 
havior) is not the same, the security result still holds within our model. 

IND-CCA Threshold Version of Paillier Cryptosystem. We can therefore
apply the previous twin conversion.

Key Generation Algorithm. Choose, for j = 0, 1, an integer $n_j$, product of two
safe primes $p_j$ and $q_j$. Set $m_j = (p_j - 1)(q_j - 1)/4$. Let $\beta_j$ be an element randomly
chosen in $\mathbb{Z}^*_{n_j}$.

The secret keys $\mathsf{sk}_j = \beta_j \times m_j$ are shared with the Shamir scheme [46] modulo
$m_j n_j$. Let $v_j$ be a square that generates the whole cyclic group of squares in
$\mathbb{Z}^*_{n_j^2}$. The verification keys $\mathsf{vk}_{i,j}$ are obtained with the formula
$v_j^{\Delta \mathsf{sk}_{i,j}} \bmod n_j^2$.

Encryption Algorithm. To encrypt a message M, randomly pick $x_j \in \mathbb{Z}^*_{n_j}$ and
compute $a_j = g_j^M x_j^{n_j} \bmod n_j^2$. Furthermore compute a proof that $a_0$ and $a_1$
encrypt the same value: let r be a randomly chosen element in [0, A[, and pick
random elements $\alpha_j \in \mathbb{Z}^*_{n_j}$. Compute $y_j = g_j^r \alpha_j^{n_j} \bmod n_j^2$. Let e be the hash value
$H(g_0, g_1, a_0, a_1, y_0, y_1)$ where H is a hash function which outputs values in the
range [0, B[. Then, compute $z = r + e \times M$ and $u_j = \alpha_j x_j^e \bmod n_j$. A proof of
equality is the tuple

$$(e, z, u_0, u_1) \in [0, B[ \times [0, A[ \times \mathbb{Z}^*_{n_0} \times \mathbb{Z}^*_{n_1}.$$

It is checked by the equation

$$e = H(g_0, g_1, a_0, a_1, g_0^z u_0^{n_0} / a_0^e \bmod n_0^2, g_1^z u_1^{n_1} / a_1^e \bmod n_1^2).$$

The decryption process is the same as in [19]. Furthermore, the above proof
can be shown to be simulation-sound, using the same technique as for the
El Gamal scheme, thanks to the forking lemma [39].
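
The encryption step with its proof of equality can be sketched as follows. The
bounds A and B, the choice g_j = n_j + 1, the SHA-256 stand-in for H, and all
function names are assumptions of this sketch; the range check on z that a real
verifier would perform is left out, so this only illustrates why the checking
equation holds.

```python
import hashlib
import math
import secrets

A_BOUND = 2 ** 64   # assumed size of the range [0, A[ for the mask r
B_BOUND = 2 ** 16   # assumed size of the range [0, B[ for the challenge e

def H(*values):
    """Stand-in for the hash function H with outputs in [0, B[."""
    data = "|".join(str(v) for v in values).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % B_BOUND

def rand_unit(n):
    """Random element of Z_n^*."""
    while True:
        x = secrets.randbelow(n - 1) + 1
        if math.gcd(x, n) == 1:
            return x

def twin_encrypt(keys, M):
    """keys = [(n_0, g_0), (n_1, g_1)]; returns (a_0, a_1, (e, z, u_0, u_1))."""
    xs = [rand_unit(n) for n, _ in keys]
    a = [pow(g, M, n * n) * pow(x, n, n * n) % (n * n)
         for (n, g), x in zip(keys, xs)]
    r = secrets.randbelow(A_BOUND)
    alphas = [rand_unit(n) for n, _ in keys]
    y = [pow(g, r, n * n) * pow(al, n, n * n) % (n * n)
         for (n, g), al in zip(keys, alphas)]
    e = H(keys[0][1], keys[1][1], a[0], a[1], y[0], y[1])
    z = r + e * M                                  # computed over the integers
    u = [al * pow(x, e, n) % n for (n, _), al, x in zip(keys, alphas, xs)]
    return a[0], a[1], (e, z, u[0], u[1])

def verify(keys, a0, a1, proof):
    """Recompute y_j = g_j^z * u_j^{n_j} / a_j^e mod n_j^2 and check the hash."""
    e, z, u0, u1 = proof
    ys = []
    for (n, g), a, u in zip(keys, (a0, a1), (u0, u1)):
        n2 = n * n
        ys.append(pow(g, z, n2) * pow(u, n, n2) * pow(a, -e, n2) % n2)
    return e == H(keys[0][1], keys[1][1], a0, a1, ys[0], ys[1])
```

The same z is used under both moduli, which is what ties the two ciphertexts to
a single plaintext M.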

It is worth noting that the Generic Conversion of the Paillier cryptosystem
keeps the homomorphic properties, namely that $\mathcal{E}(M_1 + M_2) = \mathcal{E}(M_1) \times \mathcal{E}(M_2)$
and $\mathcal{E}(M)^k = \mathcal{E}(kM)$. For example, in voting schemes such as [15,1], the
authority can check the universally checkable proofs of validity of the ciphertexts
and compute the tally. However, the result will no longer be a ciphertext that
withstands CCA.

5 Conclusion 

In this paper we have constructed generic conversions to threshold cryptosystems
secure against chosen-ciphertext attacks from any cryptosystem secure against
CPA. We have proposed the first CCA-secure threshold cryptosystems
which rely on the factorization problem. A new version of the Paillier cryptosystem
based on a new assumption related to RSA appears in [13]. By applying our
techniques, one can also share this cryptosystem under their new assumption.
This provides the second threshold cryptosystem secure under CCA based on
RSA.

However, as it is noted in [48], it appears to be difficult to share RSA. It seems 
even difficult to share OAEP-RSA without redundancy, which is a cryptosystem 
which achieves IND-CPA, but in the random oracle model. Indeed, the proof of 
membership appears to be odd and not practical. 
Abstract. We study the problem of Oblivious Polynomial Evaluation
(OPE). There are two parties, Alice who has a polynomial P, and Bob
who has an input x. The goal is for Bob to compute P(x) in such a way that
Alice learns nothing about x and Bob learns only what can be inferred
from P(x). Previously existing protocols are based on some intractability
assumptions that have not been well studied [15,14], and these protocols 
are only applicable for polynomials over finite fields. In this paper, we 
propose efficient OPE protocols which are based on Oblivious Transfer 
only. Unlike that of [15], slight modifications to our protocols immedi- 
ately give protocols to handle multi-variate polynomials and polynomi- 
als over floating-point numbers. Many important real-world applications 
deal with floating-point numbers, instead of integers or arbitrary finite 
fields, and our protocols have the advantage of operating directly on 
floating-point numbers, instead of going through finite field simulation 
as that of [14]. As an example, we give a protocol for the problem of 
Oblivious Neural Learning, where one party has a neural network and 
the other, with some training set, wants to train the neural network in 
an oblivious way. 



1 Introduction 

Assume that there are two parties, Alice who has a function f and Bob who has
an input x. They want to collaborate in a way for Bob to compute f(x) such
that Alice learns nothing about x and Bob learns only what can be inferred from
f(x). A protocol achieving this task for any function f and any input x is called
an Oblivious Function Evaluation protocol. The remarkable results of Yao [17]
and Goldreich, Micali, and Wigderson [9] showed that such protocols exist, under
some standard cryptographic assumptions. Their protocols use a Boolean circuit
to represent the function f and then simulate the computation of this circuit in
some oblivious way. The computational or communicational overhead of their
protocols depends only linearly on the circuit size of the function f, which is the
best one can expect from a complexity-theoretical point of view. However, their
protocols are far from being practical in general, and this problem still needs
a lot of work. One line of research is to study cases when different
representations of functions can lead to more efficient simulation.

C. Boyd (Ed.): ASIACRYPT 2001, LNCS 2248, pp. 369-384, 2001. 

(c) Springer- Verlag Berlin Heidelberg 2001 
Naor and Pinkas [15] considered polynomials over finite fields. Note that any
function from m bits to m bits can be represented by a polynomial over a finite
field $GF(2^m)$, but its degree could go as high as $2^m - 1$. So one would like to focus
on those functions that can be represented by low degree polynomials. This turns
out to have several interesting applications [15,8,14,12]. The scheme proposed in
[15] is much more efficient than the conventional way of going through oblivious
circuit evaluation, but its security is based on two assumptions. One assumption
is the existence of a secure Oblivious Transfer protocol while the other,
proposed by themselves, is the intractability of a Noisy Polynomial Interpolation
Problem. Bleichenbacher and Nguyen [3] later showed that this new assumption
may be much weaker than expected and suggested the use of a possibly stronger
intractability assumption on a Polynomial Reconstruction Problem. Still, no one
can say how hard this problem is as it is not that well-studied. Recently,
Lindell and Pinkas [14] mentioned a not-yet-published OPE protocol, which is also
based on some newly proposed assumption. The assumption is that the
Decisional Diffie-Hellman Assumption, denoted as DDH, also holds over the group
$\mathbb{Z}^*_{n^2}$, where n is the product of two large primes. Contrary to the well studied
DDH over $\mathbb{Z}_p^*$ [2], more research may need to be done before one can have some
confidence in this new assumption. As there may be doubt on the security of
both existing OPE protocols, a more satisfactory solution is certainly welcome.

As in [15,14], we will focus on the case with semi-honest parties, who may be
curious but still follow the protocol. The malicious case can be handled in some
standard way using commitments and zero-knowledge proofs, which will only
be briefly mentioned. We will propose three OPE protocols of different flavors.
Compared to previous ones, the security of our first two protocols is only based
on a well-accepted cryptographic assumption, namely, the existence of a secure
1-out-of-2 oblivious transfer protocol, denoted as $OT^2_1$. For polynomials of degree
d over a finite field F, our first protocol uses $d \log |F|$ invocations of $OT^2_1$ while
[15] needs $(2kd + 1) \log m$ invocations of $OT^2_1$ for some unspecified integers k and
m depending on their proposed assumption.^1 Note that for the problem in
their assumption to be intractable, at least m must be very large just to prevent a
brute-force algorithm that tries every possibility. So, even with their additional
security concern, their protocol is better than ours only when $|F| > m^{2k + 1/d}$,
i.e. when |F| is very large. Moreover, other than carrying out OT’s, our protocol
involves only extremely simple computation. Our second protocol is less efficient
than our first one, but we include it here as the technique for achieving security
seems interesting and may have other applications. Our third protocol involves
a third party who does not collude with others but may be curious, and our
protocol is perfectly secure, without any cryptographic assumption. Unlike that
of [15], all our protocols can immediately handle multi-variate polynomials.

^1 Actually they use 2kd + 1 invocations of 1-out-of-m oblivious transfer, denoted as
$OT^m_1$. It is known that one $OT^m_1$ can be simulated by log m calls to $OT^2_1$, together
with several evaluations of a pseudo-random function [15].

One attractive feature of our protocols is that they can be modified very
easily to handle floating-point numbers. This is not the case for existing OPE
protocols which rely on some specific properties of finite fields. Many important
applications in real life involve numerical computation over floating-point
numbers, instead of over integers or arbitrary finite fields. There is no efficient
mapping known that embeds floating-point numbers into finite fields where
arithmetic can be carried out easily. The approach of [14] is to scale floating-point
numbers up to integers with some book-keeping, apply some OPE protocol over
integers, and then do a normalization to get back floating-point numbers. This
extra work could complicate their algorithm design and slow down the
performance a little. We show how our OPE protocols over finite fields can be easily
modified to operate directly on floating-point numbers, and we believe that such
protocols are more likely to have practical applications.

In addition to computing functions obliviously, some computational tasks 
may also involve security issues and people may want to perform them in some 
oblivious way. We use machine learning as an example, and demonstrate the
applicability of our OPE protocol over floating-point numbers. Lindell and Pinkas
[14] considered the scenario where two parties, each holding a private database,
want to jointly construct a decision tree that classifies entries in both databases,
using the so-called ID3 algorithm. This kind of learning is not robust to changes
in the sense that changes to a database may cause the whole process to be run
again. We use neural networks as our learning model and consider the following
scenario. Alice has a neural network which is trained to some degree and she uses 
it to serve the classification requests from other parties. Alice wants to keep her 
neural network secret, while others want to keep their requests secret. This is 
the task of oblivious neural computing. At some point, another party Bob with a 
set of training examples wants to help Alice’s neural network get better, maybe 
for his own good later. Alice wants to have a secure learning process so that Bob 
learns nothing from her, while Bob also wants to keep his training set secret. 
Later, other parties having their own training set can help Alice too, and Alice’s 
neural network can adapt in an incremental way. This is the task of oblivious 
neural learning. We will apply our OPE protocol over floating-point numbers, 
and derive protocols for oblivious neural computing and oblivious neural learn- 
ing. 

The rest of the paper is organized as follows. In Section 2, we give definitions 
and tools that will be used later. Three OPE protocols are proposed in Section 
3. We derive OPE protocols for floating-point numbers in Section 4. In Section 
5, we show oblivious protocols for neural computing and learning. 

2 Preliminaries 

For a positive integer n, let [n] denote the set {1, . . . , n}. For an n-dimensional
vector v, let $v_i$, for $i \in [n]$, denote the component in the i’th dimension, and we
write $v = (v_1, \ldots, v_n) = (v_i)_{i \in [n]}$. Fix a security parameter r, so that numbers
about $2^{-r}$ are considered negligible and circuits of sizes about $2^r$ are considered
infeasible. For a distribution D over a set S, let D(i), for $i \in S$, denote the
probability of i according to D, and define D(A), for $A \subseteq S$, to be $\sum_{i \in A} D(i)$.




Y.-C. Chang and C.-J. Lu



Definition 1. Let D and D' be two distributions over a set S. Their distance
is defined as $d(D, D') = \max_{A \subseteq S} d_A(D, D')$, with $d_A(D, D') = |D(A) - D'(A)|$.

Note that $d(D, D') = \frac{1}{2} \sum_{i \in S} |D(i) - D'(i)|$, which is a useful way for calculating
d(D, D').
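
The two expressions for the distance can be compared on small examples. The
sketch below is only an illustration: it represents a distribution as a Python
dict from outcomes to probabilities (an assumption of this sketch), and the
brute-force maximum enumerates all subsets, so it is usable only for tiny
supports.

```python
from itertools import chain, combinations

def distance_by_sum(D, Dp):
    """d(D, D') computed as half the sum of |D(i) - D'(i)| over the support."""
    support = set(D) | set(Dp)
    return sum(abs(D.get(i, 0) - Dp.get(i, 0)) for i in support) / 2

def distance_by_max(D, Dp):
    """d(D, D') computed as the maximum over all subsets A of |D(A) - D'(A)|."""
    support = sorted(set(D) | set(Dp))
    subsets = chain.from_iterable(
        combinations(support, k) for k in range(len(support) + 1))
    return max(abs(sum(D.get(i, 0) - Dp.get(i, 0) for i in A)) for A in subsets)
```

The maximum is attained by the set of outcomes that D' favors over D (or its
complement), which is why the two computations agree.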

Definition 2. Let D and D' be two distributions. They are statistically
indistinguishable, denoted as D = D', if d(D, D') is negligible. They are
computationally indistinguishable, denoted as D = D', if $d_A(D, D')$ is negligible for
any subset A decided by a circuit of feasible size.^2

We will assume that parties in our protocols have only circuits of feasible sizes 
for computation unless mentioned otherwise. So we will focus on computational 
security, and the default distinguishability will be the computational one. 

An important cryptographic primitive is the 1-out-of-2 oblivious transfer,
denoted as $OT^2_1$. There are several variants which are all equivalent, and the one
most suited for us is the following string version of $OT^2_1$. Let F be a set.

Definition 3. An $OT^2_1$ protocol has two parties, Sender who has input
$(x_0, x_1) \in F^2$ and Chooser who has a choice $c \in \{0, 1\}$. The protocol is correct if
the Chooser learns $x_c$ for any $(x_0, x_1)$ and c. The protocol is secure if both
conditions below are satisfied for any $(x_0, x_1)$ and c:

— Chooser cannot distinguish the distribution of Sender’s messages from that
induced by Sender having a different value of $x_{1-c}$.

— Sender cannot distinguish the distributions of Chooser’s messages induced
by c and 1 - c.

Similarly one can define $OT^k_1$ for any $k \ge 3$, with Sender having k elements and
Chooser wanting to learn one. We will use $OT^k_1$, for $k \ge 2$, to denote an assumed
correct and secure $OT^k_1$ protocol. It is known that the existence of $OT^2_1$ implies
the existence of $OT^k_1$ for any $k \ge 3$ [5,15].

Definition 4. A protocol for oblivious polynomial evaluation has two parties,
Alice who has a polynomial P over some finite field F and Bob who has an input
$x_* \in F$. An OPE protocol is correct if Bob learns $P(x_*)$ for any $x_*$ and P. It is
secure if both conditions below are satisfied for any $x_*$ and P:

— Alice cannot distinguish the distribution of Bob’s messages from that induced
by Bob having a different $x_*$.

— Bob cannot distinguish the distribution of Alice’s messages from that induced
by Alice having a different P' with $P'(x_*) = P(x_*)$.

We say that a party in a protocol is semi-honest if the party follows the
protocol but may try to learn more information than he or she should. We only
focus on semi-honest parties in this paper. The case of malicious parties can
be handled in a standard way, using commitments and zero-knowledge proofs,
which will only be briefly sketched for our first protocol.

^2 Note that for A decided by a circuit C,
$d_A(D, D') = |\Pr_{x \in D}[C(x) = 1] - \Pr_{x \in D'}[C(x) = 1]|$.

Suppose D and D' are two distributions depending on distributions E and
E' respectively. For any possible outcome t of E and E', let (D|E = t) and
(D'|E' = t) denote the distributions of D and D' conditioned on E = t and
E' = t respectively. Here is a useful lemma for showing D = D', which will be
used several times in our security proofs later.

Lemma 1. D = D' provided E = E' and (D|E = t) = (D'|E' = t) for any t.

Proof. Let C be a circuit which outputs 1 with probabilities p and p' with respect
to D and D'. Let $p_t$ and $p'_t$ denote the corresponding probabilities with respect
to (D|E = t) and (D'|E' = t). Let $q_t = E(t)$ and $q'_t = E'(t)$. Then

$$|p - p'| = \left|\sum_t q_t p_t - \sum_t q'_t p'_t\right|$$

$$\le \left|\sum_t q_t p_t - \sum_t q'_t p_t\right| + \left|\sum_t q'_t p_t - \sum_t q'_t p'_t\right|$$

$$\le \sum_t |q_t - q'_t| + \sum_t q'_t |p_t - p'_t|.$$

So if $\sum_t |q_t - q'_t|$ is negligible and each $|p_t - p'_t|$ is negligible, then |p - p'| is
negligible.

Some cases later have identical E and E', and we only need to check each $|p_t - p'_t|$.

A family H of functions from $S_1$ to $S_2$ is said to satisfy a pair-wise
independent property if for any distinct $a, a' \in S_1$,

$$\Pr_{h \in H}[h(a) = h(a')] = \frac{1}{|S_2|}.$$

Let $(H, H(S_1))$ denote the distribution of (h, h(v)) with random $h \in H$ and
random $v \in S_1$, and let $(H, S_2)$ denote the uniform distribution over $H \times S_2$.
We will use the following lemma, which is a special case of the so-called Leftover
Hash Lemma [10,11].

Lemma 2. Let H be any family of functions from $S_1$ to $S_2$ satisfying the
pair-wise independent property. Then $d((H, H(S_1)), (H, S_2)) \le \sqrt{|S_2|/|S_1|}$.

A proof of this lemma is given in the appendix for completeness. 

3 Oblivious Polynomial Evaluation Protocols 

We will present three OPE protocols of different flavors in this section. Assume
that both parties have agreed that polynomials are over a finite field F and
have degrees at most d. The set of such polynomials can be identified with the
set $T = F^{d+1}$ in a natural way. Suppose now Alice has a polynomial
$P(x) = \sum_{i=0}^{d} a_i x^i$
3.1 The First Protocol for OPE

To make the picture clear, we only discuss the case F = GF(p) for some prime p.
The generalization to $GF(p^k)$ with k > 1 is straightforward. Let
$m = \lceil \log_2 |F| \rceil$. Each coefficient in the polynomial can be
represented as $a_i = \sum_{j \in [m]} a_{ij} 2^{j-1}$ with $a_{ij} \in \{0, 1\}$.
For $i \in [d]$ and $j \in [m]$, let $v_{ij} = 2^{j-1} x_*^i$. Note that for each
$i \in [d]$, $a_i x_*^i = \sum_{j \in [m]} a_{ij} v_{ij}$. The idea is to have Bob
prepare the values $v_{ij}$ and have Alice get those $v_{ij}$ with $a_{ij} = 1$, in
some secret way. This is achieved by having Bob prepare the pair
$(r_{ij}, v_{ij} + r_{ij})$ for a random noise $r_{ij}$, and having Alice get what she
wants via $OT^2_1$. Note that what Alice obtains is $a_{ij} v_{ij} + r_{ij}$.
Here is our first protocol, based only on the existence of secure $OT^2_1$.

Protocol 1

1. Bob prepares dm pairs $(r_{ij}, v_{ij} + r_{ij})_{i \in [d], j \in [m]}$, with each
$r_{ij}$ chosen randomly from F.

2. For each pair $(r_{ij}, v_{ij} + r_{ij})$, Alice runs an independent $OT^2_1$ with
Bob to get $r_{ij}$ if $a_{ij} = 0$ and $v_{ij} + r_{ij}$ otherwise.

3. Alice sends to Bob the sum of $a_0$ and those dm values she got. Bob
subtracts $\sum_{i,j} r_{ij}$ from it to obtain $P(x_*)$.



Lemma 3. Protocol 1 is correct when parties are semi-honest.

Proof. The sum Bob obtains in Step 3 is
$a_0 + \sum_{i,j} (a_{ij} v_{ij} + r_{ij}) = P(x_*) + \sum_{i,j} r_{ij}$.
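
The three steps of Protocol 1 can be traced in a short simulation. This is a minimal sketch under stated assumptions: the oblivious transfer is replaced by an idealized stand-in (`ot_2_1`, a hypothetical helper that simply returns the chosen element), both roles run in one process, and the prime 101 is a toy choice. It illustrates correctness only and provides no security.

```python
import secrets

P_MOD = 101                    # toy prime, F = GF(101) (illustrative choice)
M = P_MOD.bit_length()         # equals ceil(log2 |F|) for a prime that is not a power of 2

def ot_2_1(pair, choice):
    """Idealized 1-out-of-2 OT: the chooser learns only pair[choice]."""
    return pair[choice]

def protocol1(coeffs, x_star):
    """coeffs = [a_0, ..., a_d] is Alice's input, x_star is Bob's input."""
    d = len(coeffs) - 1
    # Step 1 (Bob): pairs (r_ij, v_ij + r_ij) with v_ij = 2^(j-1) * x_star^i.
    pairs = {}
    for i in range(1, d + 1):
        for j in range(1, M + 1):
            v = pow(2, j - 1, P_MOD) * pow(x_star, i, P_MOD) % P_MOD
            r = secrets.randbelow(P_MOD)
            pairs[i, j] = (r, (v + r) % P_MOD)
    # Step 2 (Alice): select with bit a_ij of coefficient a_i.
    total = coeffs[0] % P_MOD
    for (i, j), pair in pairs.items():
        bit = ((coeffs[i] % P_MOD) >> (j - 1)) & 1
        total = (total + ot_2_1(pair, bit)) % P_MOD
    # Step 3 (Bob): subtract the noise to recover P(x_star).
    noise = sum(r for r, _ in pairs.values()) % P_MOD
    return (total - noise) % P_MOD

print(protocol1([3, 5, 7], 10) == (3 + 5 * 10 + 7 * 10 ** 2) % P_MOD)
```

In a real deployment `ot_2_1` would be an actual two-party OT protocol, and Alice's running total and Bob's noise would live on different machines.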

Lemma 4. Protocol 1 is secure when parties are semi-honest.

Proof. First, we prove Alice’s security. Suppose P and P' are two distinct
polynomials with $P(x_*) = P'(x_*) = y_*$. According to Lemma 1, it suffices to show
that for any fixed $(r_{ij})_{i,j}$, Alice’s respective message distributions D and
D' induced by P and P' are indistinguishable. Note that the last message from
Alice is $y_* + \sum_{i,j} r_{ij}$ for both P and P' and can be ignored. So we focus on
Alice’s dm messages from the dm independent executions of OT’s. For
$0 \le k \le dm$, let $D_k$ denote the distribution with the first k messages from D
and the remaining messages from D'. Assume that there exists a distinguisher C
for D and D'. A standard argument shows that C can also distinguish
$D_{k_0-1}$ and $D_{k_0}$ for some $k_0$. Note that Alice must select different
elements from that pair in the $k_0$’th OT, as otherwise the two distributions are
identical. Then one can break Chooser’s security in $OT^2_1$ when Sender has this
input, because with Chooser’s messages for different choices replacing the
$k_0$’th message of $D_{k_0-1}$, we get exactly $D_{k_0-1}$ and $D_{k_0}$, which can be
distinguished by C. As $OT^2_1$ is assumed to be secure, D and D' are
indistinguishable, and Alice is secure.

Next, we prove Bob’s security. Note that Bob sends dm messages to Alice
for the dm independent executions of OT’s. Let $x_* \ne x'_*$, let E and E' be
Bob’s respective message distributions, and let $E_k$ denote the distribution with
the first k messages from E and the remaining messages from E'. Suppose a
distinguisher for E and E' exists. Then it can also distinguish $E_{k_0-1}$ and
$E_{k_0}$ for some $k_0$. The pairs in that $k_0$’th OT have the forms (r, v + r) and
(r', v' + r'), for some fixed v and v' and for random r and r'. Alice’s polynomial
is fixed, so which element to choose in that $k_0$’th OT is also fixed. Suppose
Alice chooses the first one in that pair. Then according to Lemma 1, there is a
fixed $r_0$ such that $E_{k_0-1}$ conditioned on Bob having $(r_0, v + r_0)$ and
$E_{k_0}$ conditioned on Bob having $(r_0, v' + r_0)$ are distinguishable. Similarly
as before, one can distinguish Sender’s messages when Sender has $(r_0, v + r_0)$
and $(r_0, v' + r_0)$ respectively and Chooser selects the first element, which
violates Sender’s security in $OT^2_1$. The case when Alice chooses the second one
in that pair can be argued similarly, by noticing that the distribution
(r, v + r) and the distribution (−v + r, r) are identical. As $OT^2_1$ is assumed to
be secure, so is Bob.



Theorem 1. Protocol 1 is correct and secure when parties are semi-honest.

Note that only dm invocations of $OT^2_1$ are required and they can be done
concurrently. Also observe that if $OT^2_1$ can achieve perfect security for Chooser
(e.g. [1]) in the information-theoretical sense, then so can Protocol 1 for Alice.

A slight modification to Protocol 1 can handle the case of malicious parties.
The only complication is to force a malicious Bob to prepare dm pairs that are
consistent in the sense that there is some $x_*$ such that $v_{ij} = 2^{j-1} x_*^i$
for every i and j, which can be achieved as follows. Bob sends his commitments of
dm pairs to Alice, Alice uses $OT^2_1$ to have her dm choices decommitted, and Bob
uses a zero-knowledge proof to convince Alice that those dm pairs are consistent.
All these can be done using, for example, the methods in [13].



3.2 The Second Protocol for OPE 



The idea of our second protocol is to have Alice hide the random shares of her
polynomial P among other random polynomials, have Bob evaluate all of them
on his input $x_*$, and then have Alice select those values corresponding to the
shares, which sum to $P(x_*)$. Recall that $T = F^{d+1}$. Let $n = \log |T| + 2r$. For
$P \in T$ and $R = (R_1, \ldots, R_n) \in T^n$, define the function
$h_{R,P} : \{0, 1\}^n \to T$ as

$$h_{R,P}(a) = P - \sum_{i \in [n]} a_i R_i.$$

It’s easy to check that for any $P \in T$, the class
$H_P = \{h_{R,P} : R \in T^n\}$ satisfies the pair-wise independent property. Here
is our second OPE protocol, which is also based on $OT^2_1$ only.

Protocol 2

1. Alice generates random $R \in T^n$ and $a \in \{0,1\}^n$ and sends
$(R_1, \ldots, R_n, h_{R,P}(a))$ to Bob. Let $R_{n+1} = h_{R,P}(a)$ and $a_{n+1} = 1$.

2. Bob generates random $r \in F^{n+1}$ and prepares n + 1 pairs
$(r_i, R_i(x_*) + r_i)_{i \in [n+1]}$.

3. For pair i, Alice runs an $OT^2_1$ with Bob to get $r_i$ if $a_i = 0$ and
$R_i(x_*) + r_i$ otherwise.

4. Alice sends the sum of the n + 1 values to Bob. Bob subtracts
$\sum_{i \in [n+1]} r_i$ from it to get $P(x_*)$.



Theorem 2. Protocol 2 is correct and secure when parties are semi-honest.

Proof. The correctness is obvious because the sum Bob obtains in Step 4
is $\sum_{i \in [n+1]} (a_i R_i(x_*) + r_i) = P(x_*) + \sum_{i \in [n+1]} r_i$. Bob’s
security proof is almost identical to that of Protocol 1, so we only prove
Alice’s security here.

Fix any two polynomials $P, P' \in T$, let D and D' denote Alice’s respective
message distributions, and let E and E' be Alice’s respective message
distributions in Step 1. According to Lemma 1, it suffices to show E = E' and
(D|E = t) = (D'|E' = t) for each t. Using an argument similar to that in
Protocol 1, one can show (D|E = t) = (D'|E' = t) for each t, as otherwise one
can break Chooser’s security in $OT^2_1$. Note that the family $H_P$ satisfies the
pair-wise independent property and E is the distribution
$(H_P, H_P(\{0, 1\}^n))$. With $n = \log |T| + 2r = (d + 1)m + 2r$, the Leftover Hash
Lemma [10,11] guarantees that the distance between E and the uniform
distribution is at most $\sqrt{|T| \, 2^{-n}} = 2^{-r}$, which is negligible.
Similarly E' also has a negligible distance to the uniform one.
So d(E, E') is negligible and E = E'. According to Lemma 1, Alice is secure.

Note that there are $(n + 1) \log |T| = O(dm(dm + r))$ bits sent in Step 1,
$O(dm + r)$ executions of $OT^2_1$ in Step 3, and m bits sent in Step 4.
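
The four steps of Protocol 2 can be sketched the same way. The code below is an illustration under assumptions, not the authors' implementation: polynomials are coefficient lists over a toy field GF(101), the OT is idealized as plain indexing, and the parameter name `sec` stands in for the parameter r in $n = \log |T| + 2r$.

```python
import secrets

P_MOD = 101                    # toy prime (illustrative choice)
M = P_MOD.bit_length()         # bits per field element

def evaluate(poly, x):
    """Horner evaluation of the coefficient list [c_0, ..., c_d] mod P_MOD."""
    acc = 0
    for c in reversed(poly):
        acc = (acc * x + c) % P_MOD
    return acc

def protocol2(poly, x_star, sec=8):
    width = len(poly)                      # d + 1 coefficients
    n = width * M + 2 * sec                # n = log|T| + 2r
    # Step 1 (Alice): random R_1..R_n, random bits a, and R_{n+1} = h_{R,P}(a).
    R = [[secrets.randbelow(P_MOD) for _ in range(width)] for _ in range(n)]
    a = [secrets.randbelow(2) for _ in range(n)]
    last = [(poly[c] - sum(a[i] * R[i][c] for i in range(n))) % P_MOD
            for c in range(width)]
    R.append(last)
    a.append(1)
    # Step 2 (Bob): noise r_i and pairs (r_i, R_i(x_star) + r_i).
    r = [secrets.randbelow(P_MOD) for _ in range(n + 1)]
    pairs = [(r[i], (evaluate(R[i], x_star) + r[i]) % P_MOD)
             for i in range(n + 1)]
    # Step 3 (Alice): idealized OT, she learns only pairs[i][a[i]].
    total = sum(pairs[i][a[i]] for i in range(n + 1)) % P_MOD
    # Step 4 (Bob): remove the noise.
    return (total - sum(r)) % P_MOD

print(protocol2([3, 5, 7], 10) == evaluate([3, 5, 7], 10))
```

Compared with the Protocol 1 sketch, Bob here works with whole polynomials $R_i$ rather than with powers of $x_*$, which is why Step 1 dominates the communication.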

3.3 A Protocol for 3-Party OPE 

Here we show how to remove the use of $OT^2_1$ with the help of a third party, Clark.
As a result, our protocol does not rely on any cryptographic assumption and
is information-theoretically secure when no collusion exists. Again, we assume
that Alice has a polynomial $P \in T$, Bob has $x_* \in F$, and only Bob learns
$P(x_*)$. Now the security must also hold against Clark so that the messages he
receives altogether look completely random to him; i.e.,

— Clark cannot distinguish the uniform distribution from the joint distribution 
of messages he receives from Alice and Bob. 

Note that our model is slightly different from that of Feige, Kilian, and Naor [7], 
who have Clark as the party to receive the result. Here is the protocol. 








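Protocol 3

1. Bob sends random $(r_i)_{i \in [d]} \in F^d$ to Alice. He also sends
$(x'_i = x_*^i + r_i)_{i \in [d]}$ to Clark.

2. Alice sends random $(s_i)_{0 \le i \le d} \in F^{d+1}$ to Bob. She also sends
$a'_0 = a_0 + s_0 - \sum_{i \in [d]} a_i r_i$ and $(a'_i = a_i + s_i)_{i \in [d]}$
to Clark.

3. Clark sends $y = a'_0 + \sum_{i \in [d]} a'_i x'_i$ to Bob, and Bob gets
$P(x_*) = y - (s_0 + \sum_{i \in [d]} s_i x'_i)$.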



Theorem 3. Protocol 3 is correct and perfectly secure provided no collusion
exists.

Proof. The correctness is easy to verify. What Alice or Clark receives is
completely random. Bob receives random $(s_i)_{0 \le i \le d}$ in Step 2, and receives
$P(x_*) + s_0 + \sum_{i \in [d]} s_i x'_i$ in Step 3, so he sees the same
distribution for any polynomial P' with $P'(x_*) = P(x_*)$. So each party is
perfectly secure as long as no collusion exists.
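
The correctness claim can be checked by expanding Clark's message: $y = a'_0 + \sum_i (a_i + s_i)(x_*^i + r_i)$, and the $a_i r_i$ terms cancel against $a'_0$. The following minimal sketch simulates the three parties in one process; the toy prime and all variable names are assumptions chosen for illustration.

```python
import secrets

P_MOD = 101   # toy prime (illustrative choice)

def protocol3(poly, x_star):
    """poly = [a_0, ..., a_d] is Alice's input, x_star is Bob's input."""
    d = len(poly) - 1
    # Step 1 (Bob): noise r_1..r_d goes to Alice, masked powers go to Clark.
    r = [secrets.randbelow(P_MOD) for _ in range(d)]
    x_masked = [(pow(x_star, i + 1, P_MOD) + r[i]) % P_MOD for i in range(d)]
    # Step 2 (Alice): noise s_0..s_d goes to Bob, masked coefficients to Clark.
    s = [secrets.randbelow(P_MOD) for _ in range(d + 1)]
    a0_masked = (poly[0] + s[0]
                 - sum(poly[i + 1] * r[i] for i in range(d))) % P_MOD
    a_masked = [(poly[i + 1] + s[i + 1]) % P_MOD for i in range(d)]
    # Step 3 (Clark): combines only masked values and sends y to Bob.
    y = (a0_masked + sum(am * xm for am, xm in zip(a_masked, x_masked))) % P_MOD
    # Bob removes the masks he knows (s from Alice, x_masked from himself).
    return (y - s[0] - sum(s[i + 1] * x_masked[i] for i in range(d))) % P_MOD

expected = (3 + 5 * 10 + 7 * 10 ** 2) % P_MOD
print(protocol3([3, 5, 7], 10) == expected)
```

No oblivious transfer appears anywhere; the price is the non-collusion assumption on Clark.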

3.4 Generalizations 

It is not hard to see that all the protocols in this section can be easily extended
to deal with multi-variate polynomials. In particular, we can solve an interesting
special case: Alice has $a = (a_1, \ldots, a_n) \in F^n$ while Bob has
$x = (x_1, \ldots, x_n) \in F^n$ and wants to learn the inner product
$a \cdot x = \sum_{i \in [n]} a_i x_i$.

We have only considered the setting where Alice and Bob have their own
inputs and Bob gets the final result. Later we will see a variation with each
input and output shared by the two parties. We call this computing with random
shares. Let’s use the inner product function as an example. Suppose that Alice
has $u, v \in F^n$ and Bob has $u', v' \in F^n$. They want to compute the inner
product of u + u' and v + v', and produce random shares, one for each party, that
sum to the inner product. This generalization can be reduced to the original
problem in the following way. Note that $(u + u') \cdot (v + v')$ is equal to

$$(u \cdot v) + (u \cdot v' + v \cdot u') + (u' \cdot v').$$

Now Alice generates a random $r \in F$ and prepares the 2(n + 1)-dimensional
vector

$$a = (-r + u \cdot v, u_1, \ldots, u_n, v_1, \ldots, v_n, 1),$$

while Bob prepares the 2(n + 1)-dimensional vector

$$x = (1, v'_1, \ldots, v'_n, u'_1, \ldots, u'_n, u' \cdot v').$$

Bob can obtain $a \cdot x = -r + (u + u') \cdot (v + v')$ using a protocol for the
original problem, and each party now holds a random share of the inner product
$(u + u') \cdot (v + v')$. The variation for multi-variate polynomials can be
handled similarly.
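
The reduction can be checked numerically. In this minimal sketch the oblivious inner-product protocol is replaced by a direct dot product (an assumption made purely to test the algebra), and the toy prime and function names are illustrative.

```python
import secrets

P_MOD = 101   # toy prime (illustrative choice)

def dot(p, q):
    return sum(s * t for s, t in zip(p, q)) % P_MOD

def shared_inner_product(u, v, u2, v2):
    """Alice holds (u, v), Bob holds (u2, v2); returns (Alice's share, Bob's share)."""
    r = secrets.randbelow(P_MOD)                       # Alice's random share
    a = [(-r + dot(u, v)) % P_MOD, *u, *v, 1]           # Alice's 2(n+1)-vector
    x = [1, *v2, *u2, dot(u2, v2)]                      # Bob's 2(n+1)-vector
    bob_share = dot(a, x)   # stand-in for the oblivious inner-product protocol
    return r, bob_share

u, v, u2, v2 = [1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]
alice, bob = shared_inner_product(u, v, u2, v2)
expected = dot([s + t for s, t in zip(u, u2)], [s + t for s, t in zip(v, v2)])
print((alice + bob) % P_MOD == expected)
```

Alice's share is r and Bob's is $a \cdot x$; neither value alone determines the inner product.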
4 Oblivious Polynomial Evaluation for Floating-Point
Numbers

4.1 Floating-Point Number System 

We first give the definition of a floating-point number system. 

Definition 5. A floating-point number is a rational number
$b = \pm \sum_{j=1}^{2m} b_j 2^{m-j}$ for some m, with $b_j \in \{0, 1\}$. Let
$\hat{m}$ denote the floating-point number system containing all such numbers
together with standard arithmetic operations.

Such a floating-point number can be represented by 2m + 1 bits: m bits for the
fractional part, m bits for the integral part, and 1 bit for the sign. Unlike finite
fields, operations in a floating-point number system are not closed and errors may
occur because of the limitation of finite precision. An underflow occurs when the
produced number needs more bits for the fractional part, and a rounding takes
place to convert it into the nearest number in the floating-point number system.
An overflow occurs when the produced number needs more bits for the integral
part, and the result is left undefined.

When we want to hide an element v of a finite field F in our previous protocols,
we generate a pair (r, r + v) with a random $r \in F$, so that any element of the
pair itself looks completely random. There is a slight complication for
floating-point numbers, but it can be easily fixed.

Lemma 5. Suppose $v, v' \in \hat{\ell}$ for some $\ell$ and suppose
$k \ge \ell + t + 1$. The distributions of v + r and v' + r' with random
$r, r' \in \hat{k}$ have a negligible distance.

Proof. The distance is at most 2 ( 2 >=- 2 -^)-n — 

4.2 An OPE Protocol for Floating-Point Numbers 

Assume Alice holds $P(x) = \sum_{i=0}^{d} a_i x^i$ where $a_i \in \hat{m}$, and Bob
holds $x_* \in \hat{m}$. For each i, let $|a_i| = \sum_{j=1}^{2m} a_{ij} 2^{m-j}$
with $a_{ij} \in \{0, 1\}$. All our previous protocols can be easily modified for
floating-point numbers, and here we only demonstrate one, which comes from
Protocol 1. We will use $OT^3_1$, which can be implemented by 2 executions of
$OT^2_1$ [15]. Let $k = (d + 1)m + t + 1$ and $n = k + \log(2dm)$. Parties agree on
the floating-point system $\hat{k}$ for random numbers, and the floating-point
system $\hat{n}$ for all arithmetic so that no underflow or overflow will ever
occur. Let $v_{ij} = 2^{m-j} x_*^i$.

Protocol 4

1. Bob prepares 2dm 3-tuples
$(r_{ij}, v_{ij} + r_{ij}, -v_{ij} + r_{ij})_{i \in [d], j \in [2m]}$, with each
$r_{ij}$ chosen randomly from $\hat{k}$.

2. For each 3-tuple $(r_{ij}, v_{ij} + r_{ij}, -v_{ij} + r_{ij})$, Alice runs an
$OT^3_1$ with Bob to get $r_{ij}$ if $a_{ij} = 0$, $v_{ij} + r_{ij}$ if
$a_{ij} = 1 \wedge a_i > 0$, and $-v_{ij} + r_{ij}$ otherwise.

3. Alice sends to Bob the sum of $a_0$ and those 2dm values she got. Bob
subtracts $\sum_{i,j} r_{ij}$ from it to obtain $P(x_*)$.
Note that all the arithmetic is carried out in the system $\hat{n}$, which is large
enough to guarantee that no error ever occurs. Then it’s not hard to verify the
correctness of this protocol, while its security is guaranteed by the following.
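
The correctness of Protocol 4 can be illustrated with exact rational arithmetic. This is a minimal sketch under assumptions: Python's `fractions.Fraction` stands in for the error-free system, the 1-out-of-3 OT is idealized as indexing into the tuple, and the helper names and the default t = 8 are hypothetical choices.

```python
from fractions import Fraction
import secrets

def bits_of(value, m):
    """Bits a_1..a_2m with |value| = sum_j a_j * 2^(m-j)."""
    scaled = abs(Fraction(value)) * 2 ** m
    assert scaled.denominator == 1 and scaled < 2 ** (2 * m), "value not in system"
    n = int(scaled)
    return [(n >> (2 * m - j)) & 1 for j in range(1, 2 * m + 1)]

def random_in_system(k):
    """Random signed number with k integral and k fractional bits."""
    magnitude = Fraction(secrets.randbelow(2 ** (2 * k)), 2 ** k)
    return -magnitude if secrets.randbelow(2) else magnitude

def protocol4(coeffs, x_star, m, t=8):
    d = len(coeffs) - 1
    k = (d + 1) * m + t + 1
    total = Fraction(coeffs[0])          # Alice's running sum, starting at a_0
    noise = Fraction(0)                  # Bob's sum of all r_ij
    for i in range(1, d + 1):
        a_bits = bits_of(coeffs[i], m)
        for j in range(1, 2 * m + 1):
            v = Fraction(2) ** (m - j) * Fraction(x_star) ** i
            r = random_in_system(k)
            triple = (r, v + r, -v + r)            # Step 1 (Bob)
            if a_bits[j - 1] == 0:                 # Step 2 (Alice's choice)
                choice = 0
            elif coeffs[i] > 0:
                choice = 1
            else:
                choice = 2
            total += triple[choice]                # idealized 1-out-of-3 OT
            noise += r
    return total - noise                           # Step 3 (Bob)

coeffs = [Fraction(3, 2), Fraction(-5, 4), Fraction(1, 2)]
x = Fraction(7, 4)
print(protocol4(coeffs, x, m=3) == sum(c * x ** i for i, c in enumerate(coeffs)))
```

The third tuple entry is what handles negative coefficients: the sign of $a_i$ is folded into Alice's choice rather than revealed.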

Lemma 6. Protocol 4 is secure when parties are semi-honest. 

Proof. Alice’s security proof is almost identical to that of Protocol 1, so we only
discuss Bob’s security here. Let $x_*, x'_* \in \hat{m}$, let E and E' be Bob’s
respective message distributions, and let $E_k$ denote the distribution with the
first k messages from E and the remaining messages from E'. Suppose
$E_{k_0-1}$ and $E_{k_0}$ can be distinguished, for some $k_0$, and the 3-tuples in
that $k_0$’th OT have the forms (r, v + r, −v + r) and (r', v' + r', −v' + r'), for
random r and r' and for some fixed v and v'. Let $\ell = (d + 1)m$ and note that
$v, v' \in \hat{\ell}$ because $2^{m-j} x^i \in \hat{\ell}$ for any $x \in \hat{m}$,
$i \in [d]$ and $j \in [2m]$. Then according to Lemma 5, no matter which
element Alice chooses, the two distributions of that element have a negligible
distance. Using Lemma 1 and adapting Bob’s security proof for Protocol 1, one
can show that E and E' are indistinguishable.

Note that the generalizations discussed in Section 3.4 also hold for floating- 
point numbers, and we have the following theorem. 

Theorem 4. Oblivious protocols exist for the problem of multi-variate polyno- 
mial evaluation (with random shares) over floating-point numbers. 

5 Oblivious Neural Learning 

5.1 Neural Computing and Learning 

There are several variants of the neural network model. We only demonstrate our
result via 2-layer feedforward neural networks with back-propagation learning.
Other variants can be handled similarly.

A 2-layer feedforward neural network has an internal layer of J nodes, with
the j’th node having a weight vector $u_j = (u_{j1}, \ldots, u_{jI})$, and an output
layer of K nodes, with the k’th node having a weight vector
$w_k = (w_{k1}, \ldots, w_{kJ})$. Each node is associated with an activation function
$f(z) = a \tanh(bz)$ (the hyperbolic tangent function). The network takes an input
vector $x = (x_1, \ldots, x_I)$ and produces an output vector
$o = (o_1, \ldots, o_K)$ in the following way.



Neural Computing 




1. Compute $y_j = f(u_j \cdot x)$, for $j \in [J]$. Let $y = (y_1, \ldots, y_J)$.

2. Compute $o_k = f(w_k \cdot y)$, for $k \in [K]$.





The output vector o may not be correct, and a learning algorithm adjusts
the weights according to how the vector o differs from the correct output vector
d. The pair (x, d) constitutes a training example. The back-propagation learning
(BP-Learning) algorithm adjusts the weights in the following way, with $\gamma$
being some learning constant.

BP-Learning

1. Compute $\delta_{ok} = \frac{b}{a}(d_k - o_k)(a^2 - o_k^2)$, for $k \in [K]$.

2. Compute $\delta_{yj} = \frac{b}{a}(a^2 - y_j^2) \sum_{k=1}^{K} \delta_{ok} w_{kj}$,
for $j \in [J]$.

3. Update $w_{kj} = w_{kj} + \gamma \delta_{ok} y_j$, for $k \in [K], j \in [J]$.

4. Update $u_{ji} = u_{ji} + \gamma \delta_{yj} x_i$, for $i \in [I], j \in [J]$.



The process above can be repeated for a set of training examples. 
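
For reference, the plain (non-oblivious) computation and one BP-Learning step can be written directly from the formulas above. This is a minimal sketch: the constants a = b = 1 and the learning constant 0.1 are illustrative assumptions, and the factor b/a comes from the identity $f'(z) = \frac{b}{a}(a^2 - f(z)^2)$ for $f(z) = a \tanh(bz)$.

```python
import math

A, B, GAMMA = 1.0, 1.0, 0.1     # a, b and the learning constant (illustrative)

def f(z):
    return A * math.tanh(B * z)

def dot(p, q):
    return sum(s * t for s, t in zip(p, q))

def forward(u, w, x):
    """Neural Computing: hidden outputs y and network outputs o."""
    y = [f(dot(uj, x)) for uj in u]
    o = [f(dot(wk, y)) for wk in w]
    return y, o

def bp_step(u, w, x, target):
    """One BP-Learning step; returns the updated weight matrices (u, w)."""
    y, o = forward(u, w, x)
    d_o = [(B / A) * (dk - ok) * (A * A - ok * ok) for dk, ok in zip(target, o)]
    d_y = [(B / A) * (A * A - yj * yj)
           * sum(d_o[k] * w[k][j] for k in range(len(w)))
           for j, yj in enumerate(y)]
    new_w = [[w[k][j] + GAMMA * d_o[k] * y[j] for j in range(len(y))]
             for k in range(len(w))]
    new_u = [[u[j][i] + GAMMA * d_y[j] * x[i] for i in range(len(x))]
             for j in range(len(u))]
    return new_u, new_w

u0 = [[0.1, -0.2], [0.3, 0.4]]
w0 = [[0.5, -0.6]]
u1, w1 = bp_step(u0, w0, [1.0, 2.0], [0.7])
print(w1)
```

Each update is a gradient-descent step on the squared error $\frac{1}{2} \sum_k (d_k - o_k)^2$, which is also why every quantity involved is a low-degree polynomial in the weights, inputs and activation outputs.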



5.2 Oblivious Neural Computing and Learning 

Now we want to carry out neural computing and neural learning in an oblivious
way between two parties, Alice and Bob. Oblivious neural computing can be
defined in a way similar to oblivious polynomial evaluation, except with Alice’s
polynomial replaced by a neural network. For oblivious neural learning, Bob has
a set of training examples and wants to train Alice’s neural network so that
Bob knows nothing about Alice’s neural network while Alice knows only what
is implied by the weight changes. We need to be careful about Bob’s security, as
Alice’s neural network has IJ + JK weights and that many weight changes may
reveal a lot to Alice. So we do not let Alice know the weight changes induced
by each training example, and only let her get the overall weight changes after
the training of all examples. Now a learning protocol is secure for Bob if Alice
cannot distinguish two training sets that give the same overall weight changes.
Note that in practice, neural learning typically involves large training sets.

Another scenario is for Bob to keep random shares of those final weights, as
long as he is willing to help Alice serve requests from other parties for oblivious
neural computing. Later when another party wants to continue the training
of Alice’s neural network, Bob only needs to help with his shares for the first
training example, and his duty is off after that. Contrary to the previous scenario,
Alice cannot learn anything about Bob’s training set in this way.



5.3 Oblivious Activation Function Evaluation 

Here we discuss options for evaluating the activation function
$f(z) = a \tanh(bz) = a(1 - \frac{2}{e^{2bz} + 1})$ in an oblivious way. We will rely
on a protocol for oblivious circuit evaluation [17,9,16], denoted as OCE, which is
efficient for small circuits. Assume that Alice has x while Bob has y, and they
want to generate random shares of f(x + y) for Alice and Bob. One way is to use an
OCE directly, if one can accept that the circuit for f is reasonably small. For
cases allowing a large b, f(z) is close to the threshold function, which has a very
simple circuit, and again we can use OCE directly. Otherwise, we will approximate
f in a piece-wise way by low degree polynomials and then apply our OPE
protocol for it, which is described in the following. As f is smooth, there are
intervals $I_0 = (-\infty, \ell_0], I_1 = (\ell_0, \ell_1], \ldots,
I_n = (\ell_{n-1}, \infty)$, and degree-d polynomials $P_0, P_1, \ldots, P_n$ such
that

$$f(z) \approx P_i(z) \text{ for } z \in I_i,$$

for some small n and d, which seem good enough for practical purposes.^ Let I
be the function such that I(z) = i for $z \in I_i$, which has a rather simple circuit
and thus an efficient OCE protocol. Let $P_{i,x}(y) = P_i(x + y)$. Here is the
oblivious protocol for evaluating the activation function.



Protocol 5

1. Alice generates random $r_1$. Bob runs OCE with Alice to get
$r_2 = I(x + y) - r_1$.

2. Alice generates random $s_1$ and prepares the polynomial

$$Q_x(\alpha, y) = -s_1 + \sum_{i=0}^{n} \Big( \prod_{j \ne i} \frac{r_1 + \alpha - j}{i - j} \Big) P_{i,x}(y).$$

Bob runs OPE with Alice for $s_2 = Q_x(r_2, y)$.

Note that Alice has $s_1$ and Bob has $s_2$ with $s_1 + s_2 = P_i(x + y)$ for
$x + y \in I_i$, so the protocol is correct. The security proof is again similar to
previous ones.
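
The polynomial in Step 2 works because the product over $j \ne i$ is a Lagrange selector: it equals 1 when $r_1 + r_2 = i$ and 0 at every other index in {0, ..., n}. The sketch below checks this with exact rationals. The three toy pieces, the helper names, and the direct evaluation at z = x + y (in place of running OCE and OPE) are assumptions made for illustration.

```python
from fractions import Fraction

def selector(i, index, n):
    """prod_{j != i} (index - j) / (i - j): 1 if index == i, 0 for other index in 0..n."""
    out = Fraction(1)
    for j in range(n + 1):
        if j != i:
            out *= Fraction(index - j, i - j)
    return out

def q_value(pieces, r1, r2, s1, z):
    """Bob's share: -s1 + sum_i selector_i(r1 + r2) * P_i(z), with z = x + y."""
    n = len(pieces) - 1
    return -s1 + sum(selector(i, r1 + r2, n) * pieces[i](z) for i in range(n + 1))

# Hypothetical 3-piece approximation: constant tails and a linear middle piece.
pieces = [lambda z: Fraction(-1), lambda z: Fraction(z), lambda z: Fraction(1)]

z = Fraction(1, 2)
interval = 1                      # I(z) for this toy example
r1 = 5                            # Alice's mask from Step 1
r2 = interval - r1                # what Bob learns from OCE
s1 = Fraction(3, 7)               # Alice's share from Step 2
s2 = q_value(pieces, r1, r2, s1, z)
print(s1 + s2 == pieces[interval](z))
```

Bob only ever sees the masked index $r_2$, yet the selector picks out the right piece because Alice has baked $r_1$ into the polynomial.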



5.4 Oblivious Neural Algorithms 

First we need to determine the possible range of floating-point numbers that can
ever occur during computation. Then we can determine an appropriate
floating-point number system $\hat{k}$ for random numbers and a system $\hat{n}$ for
error-free arithmetic. Here is the protocol for oblivious neural computing which
uses the OCE and OPE protocols with random shares.

Protocol 6

1. For $j \in [J]$, Alice and Bob compute random shares $s_{j1}, s_{j2}$ of the inner
product $u_j \cdot x$, and then compute random shares $y_{j1}, y_{j2}$ of
$y_j = f(s_{j1} + s_{j2})$. Let $y = (y_1, \ldots, y_J)$.

2. For $k \in [K]$, Alice and Bob compute random shares $t_{k1}, t_{k2}$ of
$w_k \cdot y$, and then compute random shares $o_{k1}, o_{k2}$ of
$o_k = f(t_{k1} + t_{k2})$.



At the end, Alice can send her shares $o_{k1}$ to Bob for him to obtain the output
vector o. This is not needed for oblivious learning. Note that the protocol still
works when each weight vector is shared by two parties instead of owned by
Alice, which is the case in oblivious learning.

® For example, the error can be bounded by 2 x 10“® with n = 9, d = 9, £o = —7, 
£g = 7, Po = —1, and Pg = 1. 
Theorem 5. Oblivious neural computing can be achieved by Protocol 6.

Proof. The correctness is easy to verify. The security relies on the security of
the protocol for oblivious polynomial evaluation with random shares and the
protocol for oblivious evaluation of the activation function. Any breaking of
Protocol 6’s security gives a way for breaking one of the protocols which has
been shown to be secure.

An oblivious neural learning protocol can be derived similarly. Now only the 
protocol for OPE with random shares is needed. 



Protocol 7 

1. Alice and Bob compute random shares of each
$\delta_{ok} = \frac{b}{a}(d_k - o_k)(a^2 - o_k^2)$.

2. Alice and Bob compute random shares of each
$\delta_{yj} = \frac{b}{a}(a^2 - y_j^2) \sum_{k=1}^{K} \delta_{ok} w_{kj}$.

3. Alice and Bob compute random shares of each
$w_{kj} = w_{kj} + \gamma \delta_{ok} y_j$.

4. Alice and Bob compute random shares of each
$u_{ji} = u_{ji} + \gamma \delta_{yj} x_i$.



The learning process can be repeated for a set of training examples. At the 
end of the whole process, Bob reveals his shares of those weights obtained in the 
last iteration, and Alice derives the resulting neural network. The correctness is 
easy to verify. The security can be proved similarly as before. Now Alice cannot 
distinguish among training sets that give the same overall weight changes. So we 
have the following theorem. 

Theorem 6. Oblivious neural learning can be achieved by the combination of
Protocol 6 and Protocol 7.

As discussed before, an alternative scenario is not to have Bob give away his 
final shares to Alice, but for him to help Alice for her future task. In this way, 
Alice only obtains random shares of her new weights after each training example, 
including the final one. So each training example is secure and now Alice learns 
nothing about Bob’s training set. 

Acknowledgements. We would like to thank Prof. Yuh-Dauh Lyuu for his 
help. 
A Proof of Lemma 2

Let $\ell = |H||S_2|$. From Cauchy-Schwarz,
$\sum_{h,v} |\Pr_{g,u}[(g, g(u)) = (h, v)] - 1/\ell|$ is at most

$$\begin{aligned}
&\sqrt{\ell \sum_{h,v} \big(\Pr_{g,u}[(g, g(u)) = (h, v)] - 1/\ell\big)^2} \\
&= \sqrt{\ell \sum_{h,v} \Pr_{g,u}[(g, g(u)) = (h, v)]^2 - 2 + 1} \\
&= \sqrt{\ell \Pr_{h,h',u,u'}[(h, h(u)) = (h', h'(u'))] - 1} \\
&= \sqrt{\ell \Pr_{h,h'}[h = h'] \Pr_{h,u,u'}[h(u) = h(u')] - 1} \\
&\le \sqrt{|S_2| \big(\Pr_{u,u'}[u = u'] + \Pr_{h,u,u'}[h(u) = h(u') \mid u \ne u']\big) - 1} \\
&= \sqrt{|S_2| (1/|S_1| + 1/|S_2|) - 1} \\
&= \sqrt{|S_2|/|S_1|}.
\end{aligned}$$
Abstract. We study the two-party commitment problem, where two
players have secret values they wish to commit to each other. Traditional 
commitment schemes cannot be used here because they do not guaran- 
tee independence of the committed values. We present three increasingly 
strong definitions of independence in this setting and give practical proto- 
cols for each. Our work is related to work in non-malleable cryptography. 
However, the two-party commitment problem can be solved much more 
efficiently than by using non-malleability techniques. 



1 Introduction 

We consider the scenario in which two players have some private values in mind, 
and want to commit these values to one another. In these circumstances, simply 
using commitment schemes on each side does not provide sufficient security. 
While this approach guarantees that the two commitments will each be hiding 
and binding, it does not guarantee their independence. 

For example, if Alice is selling something to Bob, and commits to a (her 
lowest price) by publishing c(a), then Bob can commit to a as his highest bid 
(without knowing the value), by copying c(a). Thus, Bob will force Alice to 
always sell at her lowest price. Though this is an obvious and easily preventable 
attack, more sophisticated ones exist. For example, if the commitment scheme 
being used is that of Pedersen [Ped91], Bob could, without risking detection, 
copy (or indeed add an arbitrary constant to) Alice’s value. 
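
The Pedersen attack mentioned above can be made concrete. In a Pedersen commitment $c = g^a h^r$, multiplying c by $g^{\delta} h^{s}$ yields a valid commitment to $a + \delta$ with randomness r + s, which Bob can open as soon as Alice reveals (a, r). The sketch below uses a toy group (the order-11 subgroup of $Z_{23}^*$ with generators 4 and 9); these parameters and the function names are assumptions for illustration and offer no security.

```python
P, Q = 23, 11        # toy parameters: subgroup of prime order 11 in Z_23^*
G, H = 4, 9          # two generators of that subgroup; log_G(H) assumed unknown

def commit(value, randomness):
    """Pedersen commitment G^value * H^randomness mod P."""
    return pow(G, value % Q, P) * pow(H, randomness % Q, P) % P

def maul(c, delta, s):
    """Bob's move: derive a commitment to (a + delta) without knowing a."""
    return c * pow(G, delta % Q, P) * pow(H, s % Q, P) % P

def verify(c, value, randomness):
    return c == commit(value, randomness)

a, r = 7, 3                       # Alice's secret value and randomness
c_alice = commit(a, r)
delta, s = 2, 5                   # Bob's chosen offset and re-randomization
c_bob = maul(c_alice, delta, s)

# After Alice opens (a, r), Bob opens to a + delta with randomness r + s.
print(c_bob != c_alice, verify(c_bob, (a + delta) % Q, (r + s) % Q))
```

Setting delta = 0 gives a re-randomized copy of Alice's value, so even a check for identical commitments would not catch the correlation.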

Independence of committed values is quite fundamental to secure two-party
protocols. Indeed, in any protocol to which both parties have inputs which they
are unwilling to reveal at the outset, the inputs must be committed (so that the
parties cannot change their minds later) and independent (so that each party’s
influence on the outcome is limited to the choice of its own input).

The two-party commitment problem. In our setting, Alice and Bob have
secret values a and b, respectively. They want to commit their values to each
other. Informally, we want the following security properties to hold:

C. Boyd (Ed.): ASIACRYPT 2001, LNCS 2248, pp. 385-401, 2001. 
— Hiding: A dishonest party cannot discover the honest party’s value.

— Binding: A dishonest party cannot open his or her commitment in more than 
one way. 

— Non-correlation: A dishonest party cannot commit to a value that is in some
significant way correlated to the honest party’s value.

We formalize the last property in three increasingly stronger definitions. 

— Mutually independent announcement. Non-correlation is guaranteed given 
that the parties open their commitments. 

— Mutually independent commitment: Non-correlation is guaranteed once the 
commitments are exchanged and accepted. 

— Mutually independent and aware commitment: Each party is guaranteed to 
know his or her own value once the commitments are exchanged and ac- 
cepted. This property, combined with the hiding property of the commit- 
ment, actually guarantees non-correlation. 

We also give practical protocols that satisfy these definitions. Specifically, we 
give a two-round^ mutually independent announcement protocol based on the 
existence of one-way permutations. We give two mutually independent commit- 
ment protocols: a two-round protocol based on the assumption that subexpo- 
nentially hard one-way permutations exist, and a three-round protocol based on 
the assumption that dense cryptosystems exist. Finally, we give a seven-round 
mutually independent and aware commitment protocol based on the discrete log- 
arithm assumption. With the exception of an eleven-round mutually independent 
and aware protocol we present to elucidate the definitions, all the protocols we 
present are efficient enough to be useful in practice. 



1.1 Mutual Independence versus Other Notions 

Mutually independent commitments provide a new approach to an important
cryptographic problem: how to ensure that secret and committed values are
independent. This problem has been addressed before in other settings.



Independence in the Multi-party Setting. In protocols involving more 
than two parties, it has long been recognized that independence of committed 
values is fundamental to the very notion of security: a player who can correlate 
his input to those of other players (without necessarily knowing them) may
be able to change the outcome of the protocol in his favor. In that setting, 
the problem of independence was introduced by Chor, Goldwasser, Micali and 
Awerbuch [CGMA85], who solved it using verifiable secret-sharing protocols. 

^ The round complexity we refer to is the number of rounds required for the commit 
stage of the protocols. All the protocols we present have one-round reveal stages, 
except the mutually independent announcement protocol, which has a two-round 
reveal stage. 
Subsequent improvements to their solution were made by Chor and Rabin [CR87]
and by Gennaro [Gen95].

Our two-party setting, while similar to the multi-party setting at first glance, 
is actually quite different: the multi-party protocols assume that a majority of 
players are honest. This allows “committed information” to actually be dis- 
tributed among multiple players. Because we have only two players, we cannot 
assume an honest majority without trivializing the problem. Thus, each player 
in our setting will have all the committed information from the other player. 



Non-malleability of Commitment Schemes. It has also long been recog- 
nized that the hiding property of a commitment scheme (carried out between two 
parties, a sender and a receiver) does not prevent an adversary from committing 
to a value related to someone else’s commitment. In the two-party commitment 
setting, the notion of non-malleability was introduced to address this problem. 

Defined by Dolev, Dwork and Naor [DDN00], non-malleability for commitment
schemes captures the following intuitive notion: if Alice commits a value to
Bob, and Bob commits a value to Charlie (using the same commitment scheme),
then Bob’s committed value should be independent of Alice’s. Thus, this is a
setting with two honest parties (Alice and Charlie) who are unaware of each
other, and one adversary (Bob). Because Alice and Charlie are unaware of each
other, Bob can arbitrarily vary the timing of the two interactions in which he is
involved. This, in particular, implies that Bob can always just copy Alice’s
committed value by simply being a “transparent intermediary.” Copying committed
values is, in fact, explicitly permitted in the definition of [DDN00].

In our setting there are three crucial differences. First, Alice and Charlie are,
in a sense, the same person. (This, in particular, prevents Bob from arbitrarily
scheduling the executions of the two commitment protocols and thus copying the
committed value.) Second, we are not restricted to using the same commitment
scheme for the two commitments. Finally, either party in our setting can be the
adversary, and independence needs to be ensured both ways.

While, as we describe in the next section, non-malleable commitment schemes 
may be used to provide mutually independent commitments, the mutually inde- 
pendent commitment problem can be solved more efficiently in other ways. 



1.2 Relevance of Prior Solutions 

As we pointed out above, solutions in the multiparty setting seem inapplicable to 
our setting, because they assume an honest majority of players. Non-malleable 
commitments, on the other hand, can address our problem. 

In fact, any commitment protocol non-malleable with respect to commitment 
(i.e., in which non-malleability is assured even if the adversary never sees Alice’s 
decommitted value) can be used to provide mutually independent commitments: 
simply run two copies of the protocol in parallel, one from Alice to Bob and the 
other from Bob to Alice. Either party can detect if the other is copying the 
transcript, and thus prevent it from copying the commitment^. However, only
one non-malleable commitment protocol is known that does not require extra
set-up assumptions: the one of [DDN00]. It is quite impractical, requiring a
non-constant number of rounds (it will, however, achieve mutually independent and
aware commitments in our setting). In contrast, we present simple
constant-round protocols that solve the problem.

A number of much simpler non-malleable commitment schemes (constructed
using either non-malleable encryption [DDN00,CS98,Sah99] or directly
[DIO98,FF00,DKOS01,CF01]) are known, all requiring a trusted public file to be set
up ahead of time. Because we are interested in a two-party scenario, we are
unwilling to assume the existence of trusted public parameters.

Moreover, some of the above commitment schemes [DIO98,FF00,DKOS01] 
achieve only a weaker security notion called non-malleability with respect to 
opening. That is, it may be possible for Bob to commit to a value related to Al- 
ice’s, but he won’t know how to open it. Using such a protocol in our setting will 
achieve mutually independent announcement but not necessarily commitment 
(in particular, some of the schemes are perfectly hiding, and then it is unclear 
how the committed value can be defined prior to opening). 

1.3 Applicability of Mutually Independent Commitments 

The protocols we present are for the two-party model. We intend our notions to 
be useful as essential building blocks in secure two-party computation protocols. 

As we have discussed, the question of mutual independence also naturally
arises in the multi-party setting, and has been previously studied. By focusing
on the two-party case exclusively, we obtain protocols that are more efficient
and conceptually simpler. Applicability of our techniques in other settings is a
subject of further study.

2 Definitions 

2.1 Notation 



Negligible functions. The expression $\mathrm{negl}(k)$ is used to denote any
function $f$ that is negligible in $k$; that is, for any positive polynomial $q$,
$f(k) = o(1/q(k))$.

Probability spaces. If $S$ is a probability space, then “$x \leftarrow S$” denotes
the algorithm that assigns to $x$ an element randomly selected according to $S$.
If $F$ is a finite set, then the notation “$x \leftarrow F$” denotes the algorithm
that chooses $x$ uniformly from $F$.

^ The original definition of [DDN00] did not prevent Bob from copying the committed
value while not copying the transcript. However, any non-malleable commitment
scheme can be modified to preclude this possibility [KOS], thus leaving Bob only one
way to copy the commitment: by copying the transcript exactly.

® This notation closely follows that of [BDMP91] and [GMR88].

If $p$ is a predicate, the notation $\Pr[x \leftarrow S;\ y \leftarrow T;\ \ldots : p(x, y, \ldots)]$
denotes the probability that $p(x, y, \ldots)$ will be true after the ordered
execution of the algorithms $x \leftarrow S;\ y \leftarrow T;\ \ldots$. The notation
$[x \leftarrow S;\ y \leftarrow T;\ \ldots : (x, y, \ldots)]$ denotes the probability space
over $\{(x, y, \ldots)\}$ generated by the ordered execution of the algorithms
$x \leftarrow S,\ y \leftarrow T,\ \ldots$.
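
To make the ordered-execution reading of this notation concrete, the following is a minimal illustrative sketch (not part of the paper) that estimates such a probability by repeated sampling. The function name `estimate_probability`, the sampler interface, and the toy distributions are all assumptions introduced here for illustration.

```python
import random

def estimate_probability(samplers, predicate, trials=100_000, seed=0):
    """Estimate Pr[x <- S; y <- T; ... : p(x, y, ...)] by ordered sampling.

    Each sampler receives the RNG and the tuple of values drawn so far,
    so a later distribution may depend on earlier outcomes.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        drawn = ()
        for sample in samplers:
            drawn += (sample(rng, drawn),)
        if predicate(*drawn):
            hits += 1
    return hits / trials

# Toy instance: x <- {0,...,5}; y <- {0,...,x} : y == 0
S = lambda rng, drawn: rng.randrange(6)
T = lambda rng, drawn: rng.randrange(drawn[0] + 1)
estimate = estimate_probability([S, T], lambda x, y: y == 0)

# Exact value for comparison: (1/6) * sum_{x=0}^{5} 1/(x+1)
exact = sum(1 / (x + 1) for x in range(6)) / 6
```

The order of the samplers matters: `T` reads the value already drawn by `S`, which is exactly what "ordered execution" allows.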

Protocols. The schemes discussed in this paper are protocols $P = (A, B)$ run
between two parties, $A$ and $B$. Both $A$ and $B$ are probabilistic polynomial-time
interactive Turing machines (ppITMs). Given (1) a security parameter $1^k$, which
is available to both parties; (2) inputs $(a, b)$, where $a$ is private to $A$ and $b$ is
private to $B$; and (3) random tapes $(r_A, r_B)$, where $r_A$ is private to $A$ and $r_B$ is
private to $B$, protocol $P$ computes in a sequence of rounds, alternating between
$A$-rounds and $B$-rounds. In an $A$-round (respectively, $B$-round) only $A$ (only $B$)
is active and sends a string that will become an available input to $B$ (to $A$) in
the next $B$-round ($A$-round). We will divide $P$ into two stages: the commit stage
$P_C = (A_C, B_C)$, and the reveal stage $P_R = (A_R, B_R)$ (state information for $A$
and $B$ is saved between the stages). At the end of the commit stage, $A_C$ and $B_C$
will each output “accept” or “reject.” At the end of the reveal stage, $A_R$ will
output the value $\beta$ that $B$ revealed to it, which is a string or a special symbol
“reject”, and $B_R$ will similarly output the value $\alpha$. For notational convenience,
we will assume that if the output of the commit stage is “reject,” then so is the
output of the reveal stage; i.e., if a party did not accept the commitment, it
will not accept its revealing, either. The terms “output of $A$” and “output of $B$”
shall mean “output of $A_R$” and “output of $B_R$.”

We will also consider the situation in which one of the two parties is dishonest. 
The dishonest party, denoted by $A'$ or $B'$, is also a ppITM. A dishonest party
can, of course, simply stop the protocol before the other party produces an 
output. In such a case, for notational convenience, we will consider the honest 
party’s output to be “reject.” 

Transcripts, views, and outputs.^ Letting $E$ be an execution of a protocol
$(A, B)$ on inputs $(1^k, a, b, r_A, r_B)$, we make the following definitions:

— The transcript of $E$ consists of the sequence of messages exchanged by $A$ and
$B$, and is denoted by $\mathrm{TRANS}^{A,B}(1^k, a, b, r_A, r_B)$ (for notational
convenience, we will include the outputs of $A$ and $B$ into the transcript);

— The view of $A$ consists of the tuple $(1^k, a, r_A, t)$, where $t$ is $E$'s transcript,
and is denoted by $\mathrm{VIEW}_A^{A,B}(1^k, a, b, r_A, r_B)$;

— The output of $A$ is denoted by $\mathrm{OUT}_A^{A,B}(1^k, a, b, r_A, r_B)$;

— The output and view of $B$ are defined similarly and denoted by
$\mathrm{VIEW}_B^{A,B}(1^k, a, b, r_A, r_B)$ and $\mathrm{OUT}_B^{A,B}(1^k, a, b, r_A, r_B)$;

— We use the symbol $\cdot$ in place of $r_A$ and $r_B$ in the above notation to denote
the distribution induced on the transcripts, views and outputs when $r_A$ or
$r_B$ is selected at random. Thus, for example, $\mathrm{TRANS}^{A,B}(1^k, a, b, \cdot, \cdot)$ is a
probability space of transcripts, with probabilities induced by selecting $r_A$
and $r_B$ at random and executing $(A, B)$ on $(1^k, a, b, r_A, r_B)$.

^ We borrow much of our protocol notation from [BMM99] and [GMR85].

The output of each party is, of course, computed based solely on that party’s
view. Therefore, we denote by $\mathrm{OUT}_A(1^k, a, r_A, t)$ the output of $A$ as computed
on the particular view (note that $A$ is not assumed to be interacting with anyone
in this case). We use similar notation for the output of $B$.

Finally, we will denote the transcript for the commit stage by $t_C$, the
transcript for the reveal stage by $t_R$, and the combined transcript by
$t = t_C \circ t_R$.

2.2 Mutually Independent Announcement 

A protocol (A, B) is a mutually independent announcement if the following prop- 
erties hold: 

— A-completeness. If $A$ and $B$ are honest, then $A$ can commit and reveal
her value successfully:

$$\forall a, b:\quad \Pr[\alpha \leftarrow \mathrm{OUT}_B^{A,B}(1^k, a, b, \cdot, \cdot) : \alpha = a] = 1 - \mathrm{negl}(k)$$

— A-soundness. This property prevents a dishonest $B'$ from influencing which
value $A$ commits to. That is, if the honest $A$ is interacting with a dishonest
$B'_C$ during the commit stage and outputs “accept,” then $A$ would only reveal
$a$ during the reveal stage, at least with an honest $B_R$.

$$\forall t_C, t_R, a, b, r_A, r_B:\quad \mathrm{OUT}_{A_C}(1^k, a, r_A, t_C) = \text{“accept”} \;\wedge\; \mathrm{OUT}_{B_R}(1^k, b, r_B, t_C \circ t_R) = \alpha \;\Rightarrow\; \alpha = a$$

— Computational A-hiding. No adversary $B'$, interacting only with $A_C$, the
commit stage of $A$, can break the GM-security [GM84] of $A$’s commitments:

$$\forall (a_0, a_1)\ \forall B':\quad \Pr[v \leftarrow \{0,1\};\ z \leftarrow \mathrm{OUT}_{B'}^{A_C,B'}(1^k, a_v, (a_0, a_1), \cdot, \cdot) : z = v] \le 1/2 + \mathrm{negl}(k)$$

— Perfect A-binding. If the commit stage $B_C$ of $B$ outputs “accept,” then the
reveal stage $B_R$ will accept only one revealed value; moreover, this value
depends only on the transcript of the reveal stage, not on the private input
of $B$.

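$$\forall t_C, b, b', r_B, r'_B, t_R, \alpha, \alpha':\quad
\mathrm{OUT}_{B_C}(1^k, b, r_B, t_C) = \text{“accept”} \;\wedge\;
\mathrm{OUT}_{B_C}(1^k, b', r'_B, t_C) = \text{“accept”} \;\wedge\;
\mathrm{OUT}_{B_R}(1^k, b, r_B, t_C \circ t_R) = \alpha \;\wedge\;
\mathrm{OUT}_{B_R}(1^k, b', r'_B, t_C \circ t_R) = \alpha'
\;\Rightarrow\; \alpha = \alpha'$$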
— A-non-correlation at opening. To define non-correlation at opening, we use
techniques similar to those used in defining commitment schemes that are
non-malleable with respect to opening ([DDN00,DIO98,FF00]). The definition
essentially states that for any polynomial-time relation $R$, any adversary
$B'$ that engages in a protocol with $A(a)$ and then opens his committed
value as $\beta$, has no more chance of achieving $R(a, \beta)$ than a simulator who
does not engage in any interaction with $A$ at all. Note, of course, that because
we are defining non-correlation at opening (rather than at commitment), $B'$
may already know $a$ before revealing $\beta$, and thus may simply refuse to reveal
depending on $a$. If $B'$ refuses to reveal, $A$ will output “reject.” There is no
getting around the fact that $B'$ can correlate “reject” to $a$. Thus, as for
non-malleability, we explicitly require that $R(a, \text{“reject”}) = 0$, so that
forcing $A$ to reject is not considered correlating to $a$ better than a simulator.
We call such polynomial-time relations allowable. Note that, unlike the
definitions of non-malleability, we do not require that the relation be
non-reflexive; that is, our definitions do not even allow $B'$ to copy the
commitment of $A$.

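$$\forall B'\ \exists S'\ \forall \text{ allowable } R\ \forall \text{ efficiently samplable } \mathcal{D}:$$
$$\Pr[a \leftarrow \mathcal{D};\ \beta \leftarrow \mathrm{OUT}_A^{A,B'}(1^k, a, \cdot, \cdot) : R(a, \beta) = 1] \;\le\;
\Pr[a \leftarrow \mathcal{D};\ \beta \leftarrow S'(1^k) : R(a, \beta) = 1] + \mathrm{negl}(k)$$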

— B-completeness, B-soundness, computational B-hiding, perfect B-binding,
B-non-correlation at opening. Defined the same way as for the above, with
A and B reversed.



2.3 Mutually Independent Commitment 

Mutually independent commitments are defined the same way as mutually in- 
dependent announcements, except for the non-correlation property, which is de- 
fined as follows. 

— A-non-correlation at commitment. Because our commitments are perfectly
B-binding, at the end of the commit stage, there is at most one value $\beta$ that
a (dishonest) $B'$ can reveal. Moreover, this value is determined uniquely by
the transcript $t_C$ of the commit stage, provided that $A$ outputs “accept.”
Let $U_B(t_C)$ be this value, or $\perp$ if no such value exists or if $A$ outputs “reject”
(recall that we included $A$’s output into the transcript, by definition). Note
that, by B-hiding, $U_B(t_C)$ is not efficiently computable, at least when $B$
is honest. A-non-correlation at commitment requires that $U_B(t_C)$ be not
correlated to $a$, for any polynomial-time relation $R$. Unlike the case for
non-correlation at opening, we do not require that $R(a, \perp) = 0$; that is, $B'$
should not even be able to correlate to $a$ the fact that no valid decommitment
exists. This stronger requirement is justified in this case, because $B'$ does
not get to see $a$ before deciding whether a decommitment should exist.

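$$\forall B'\ \exists S'\ \forall \text{ poly-time } R\ \forall \text{ efficiently samplable } \mathcal{D}:$$
$$\Pr[a \leftarrow \mathcal{D};\ t_C \leftarrow \mathrm{TRANS}^{A_C,B'_C}(1^k, a, \cdot, \cdot) : R(a, U_B(t_C)) = 1] \;\le\;
\Pr[a \leftarrow \mathcal{D};\ \beta \leftarrow S'(1^k) : R(a, \beta) = 1] + \mathrm{negl}(k)$$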

— B-non-correlation at commitment. Defined similarly, with A and B reversed.



2.4 Mutually Independent and Aware Commitment 

In addition to the properties of mutually independent commitments defined 
above, we want to capture the strong notion that B, if he accepts the com- 
mitment stage, is assured that A “knows” the value she committed to. We mean 
“knowledge” in the sense of the existence of a knowledge extractor E, in the tra- 
dition of the definitions of a proof of knowledge [TW87,FFS88]. Note, however, 
that unlike proofs of knowledge, where an NP-witness y is being extracted for 
a predetermined statement x, in our case, no predetermined statement exists. 
What is being extracted, the committed-to value, is determined only by the
transcript tc of the commitment stage. Thus, our definition gives E the view of 
the dishonest party (which includes the transcript of the conversation), and E 
has to extract, given oracle access to the dishonest party, the committed value. 

— A-awareness. Similarly to the definition of A-non-correlation at commitment,
given a transcript $t_C$ of the commitment stage, let $U_A(t_C)$ be the unique value
$\alpha$ that $A'$ can reveal, or $\perp$ if no such value exists or if $B$ output “reject” at
the end of the commitment stage. Because the view $V_{A'}$ of $A'$ includes $t_C$,
we will use $U_A(V_{A'})$ to mean $U_A(t_C)$.

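$$\exists E\ \forall b\ \forall A':\quad
\Pr[V_{A'} \leftarrow \mathrm{VIEW}_{A'}^{A'_C,B_C}(1^k, b, \cdot, \cdot);\
\alpha \leftarrow E^{A'}(V_{A'});\
a = U_A(V_{A'}) : \alpha = a] \;\ge\; 1 - \mathrm{negl}(k)$$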

— B-awareness. Defined similarly, with A and B reversed.

Note that A-awareness, combined with B-hiding, implies B-non-correlation at 
commitment, because we can simply use E in place of the simulator S. Therefore, 
aware commitments are automatically mutually independent.

3 Protocols

3.1 Mutually Independent Announcement 

Theorem 1. If one-way permutations exist, then there exists a protocol for mu- 
tually independent announcements with a two-round commit stage and a two- 
round reveal stage^ . 

Proof sketch. The protocol is simplicity itself. Let c be a perfectly binding non- 
interactive commitment scheme, which can be constructed based on any one-way 
permutation [GL89]. The commit stage consists of two rounds: 

1. Alice sends c(a) to Bob 

2. Bob sends c(b) to Alice 

The reveal stage likewise consists of two rounds: 

1. Bob opens his commitment 

2. Alice opens her commitment 

The completeness, soundness, binding, and hiding properties are easy to see.
It is clear that A-non-correlation at opening holds, since after step 1 of the
reveal stage, Bob still cannot understand Alice’s commitment, so if he could
correlate then he would break the hiding property of c. On the other hand,
B-non-correlation at opening holds, since Alice commits first. Thus, her only
option is to refuse to open her commitment, which, by definition, can only hurt
her chances of being correlated. □
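
The message flow of this protocol can be sketched in a few lines. The sketch below is illustrative only: it swaps in a salted SHA-256 hash as a stand-in for the scheme c, which is an assumption made here for brevity (a hash commitment is only computationally binding, whereas the theorem uses a perfectly binding scheme built from a one-way permutation). The function names `commit`, `verify`, and `announce` are hypothetical.

```python
import hashlib
import secrets

def commit(value: bytes):
    """Return (commitment, opening). Hash-based stand-in for the scheme c."""
    nonce = secrets.token_bytes(32)  # fixed-length nonce keeps the encoding unambiguous
    return hashlib.sha256(nonce + value).digest(), (nonce, value)

def verify(commitment: bytes, opening) -> bool:
    """Check that an opening (nonce, value) matches a commitment."""
    nonce, value = opening
    return hashlib.sha256(nonce + value).digest() == commitment

def announce(a: bytes, b: bytes):
    """Run both stages honestly; returns (output of Alice, output of Bob)."""
    # Commit stage: round 1 Alice -> Bob, round 2 Bob -> Alice.
    com_a, open_a = commit(a)
    com_b, open_b = commit(b)
    # Reveal stage: Bob opens first, then Alice.
    out_alice = open_b[1] if verify(com_b, open_b) else "reject"
    out_bob = open_a[1] if verify(com_a, open_a) else "reject"
    return out_alice, out_bob
```

The reveal order mirrors the proof sketch: Bob opens before he sees Alice's value, and Alice, who committed first, can at most refuse to open.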

It is worth noting that while non-malleable commitments “with respect to 
opening” can be used for mutually independent announcement, the solution they 
offer is far more complex than this. This elegantly illustrates the point that the 
problem we solve requires less security than the problem that non-malleable 
commitments solve. 



3.2 Mutually Independent Commitment 

All the remaining protocols we present have a one-round reveal stage under 
the imperfect synchronization assumption. That is, each player sends only one 
message in the reveal stage, and the order does not matter: we allow the honest 
players to not wait to receive a message before sending one. Note that we do 
not assume that the messages are actually sent simultaneously: dishonest players 
can always wait for receipt of a message before sending theirs. 

® This protocol can be modified to be based only on one-way functions, but this 
requires a 3-round commit stage: we simply use the construction of Naor [Nao91] to 
make a commitment scheme based on one-way functions, which requires a round to 
set up.

From this point on, when we refer to the number of rounds a protocol requires,
we mean only the rounds in the commitment stage.

We present two protocols for mutually independent commitment: a two-round 
protocol based on the assumption that subexponentially hard one-way permu- 
tations exist, and a three-round protocol based on the assumption that ‘dense’ 
cryptosystems exist. 



Two Round Protocol. A subexponentially hard one-way permutation is one
for which there exists an $\varepsilon > 0$ such that the permutation remains one-way
even against adversaries that run in time $2^{n^\varepsilon}$, where $n$ is the security
parameter. We note that, based on the current state of the art in factoring and
discrete logarithm techniques, it is reasonable to assume that both RSA and
exponentiation in a large prime-order subgroup of $\mathbb{Z}_p^*$ are subexponentially
hard one-way permutations with some $\varepsilon < 1/3$ (because the best known attacks
against them take time roughly $2^{O(n^{1/3} (\log n)^{2/3})}$).

Theorem 2. If subexponentially hard one-way permutations exist, then there
exists a two-round mutually independent commitment protocol.

Proof sketch. Let c be a subexponentially secure non-interactive commitment
scheme: i.e., one that is semantically secure against adversaries that run in time
$2^{n^\varepsilon}$ for some $\varepsilon > 0$, where $n$ is the security parameter (such a commitment
scheme can be constructed based on subexponentially hard one-way
permutations). Assume that, for security parameter $n$, a commitment can be forced
open in time $2^{n^\delta}$, for some $\delta > 0$ (this must be true for some $\delta$, because one
should be able to simply enumerate all the possible decommitment strings).

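Let $k$ be the security parameter for our scheme, and let $K$ be a polynomially
larger parameter chosen so that $2^{k^\delta} \le 2^{K^\varepsilon}$ (for example,
$K = k^{\delta/\varepsilon}$). The protocol is, again, very simple: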

1. Alice commits to a using c with security parameter K. 

2. Bob commits to b using c with security parameter k.

In the reveal stage, Alice and Bob reveal their values. It is clear that this
scheme is complete, sound, hiding and binding. It is also clear that Alice cannot
correlate her value to Bob’s, since Alice is bound to her value before Bob commits
to his. On the other hand, if Bob could correlate his value to Alice’s, we could
force open his commitment in time $2^{k^\delta}$, which by the choice of $K$ is within
the time bound $2^{K^\varepsilon}$, and then use $b$ to break the subexponentially strong
semantic security of Alice’s commitment within that same time bound, which is
a contradiction, because Alice’s security parameter is $K$. □



Three Round Protocol. This protocol assumes the existence of dense,
perfectly faithful cryptosystems. Following [DP92], a $\delta$-dense cryptosystem is
defined by modifying the definition of a secure cryptosystem [GM84] as follows:
first, we add the requirement that a public key generated by the key generation
algorithm is distributed uniformly over $\{0,1\}^{p(k)}$, for some polynomial $p$ in the
security parameter $k$; and second, we require security for only a $\delta$-fraction of
public keys. It was observed by [DDP00] that the assumption of the existence
of $\delta$-dense cryptosystems, for a non-negligible $\delta$, is equivalent to the existence of
$(1 - \varepsilon)$-dense cryptosystems, for any negligible $\varepsilon$. We will actually need the
latter. They can be constructed based on the ElGamal cryptosystem, for example.

Theorem 3. If dense cryptosystems exist, then there exists a three-round mu- 
tually independent commitment scheme. 

Proof sketch. Let $\varepsilon$ be a negligible function, and let $(G, E, D)$ be a
$(1 - \varepsilon)$-dense cryptosystem. Let $p(k)$ be the length of the public key for a
security parameter $k$. Let $c$ be a perfectly binding non-interactive commitment
scheme. The commit stage is as follows:

1. Alice generates a random $p(k)$-bit string, $R_A$, and sends $c(R_A)$ to Bob.

2. Bob sends $c(b)$ to Alice, and sends a random $p(k)$-bit string $R_B$ to Alice.

3. Alice computes $PK = R_A \oplus R_B$, $C = E(PK, a)$, and sends $PK$, $C$ to Bob.
Note that Alice does not open her commitment $c(R_A)$ at this step.

In the reveal stage, Bob opens his commitment to $b$, and Alice opens her
commitment to $R_A$ and reveals her value $a$ and the random bits used to come
up with $C$. Bob checks if $R_A$ was revealed correctly, if $PK$ indeed equals
$R_A \oplus R_B$, and if the random bits were correct. If any of these checks fail,
Bob rejects.
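
Bob's three reveal-stage checks can be illustrated as follows. This is a sketch under loud assumptions, not the construction itself: `commit` is a hash-based stand-in for c, and `placeholder_encrypt` is NOT a cryptosystem (it cannot be decrypted and is not dense); it only plays the role of the deterministic map $E(PK, a; \text{coins})$ so that the checks are executable. All function names are hypothetical.

```python
import hashlib
import secrets

def xor_bytes(x: bytes, y: bytes) -> bytes:
    """Bitwise XOR of two equal-length byte strings."""
    return bytes(p ^ q for p, q in zip(x, y))

def commit(value: bytes):
    """Hash-based stand-in for c; returns (commitment, nonce)."""
    nonce = secrets.token_bytes(32)
    return hashlib.sha256(nonce + value).digest(), nonce

def placeholder_encrypt(pk: bytes, message: bytes, coins: bytes) -> bytes:
    """NOT a cryptosystem: deterministic stand-in for E(PK, a; coins)."""
    return hashlib.sha256(pk + coins + message).digest()

def bob_reveal_checks(com_ra, r_b, pk, ciphertext, r_a, nonce, a, coins):
    """Return Alice's value a if all three checks pass, else "reject"."""
    if hashlib.sha256(nonce + r_a).digest() != com_ra:   # R_A revealed correctly?
        return "reject"
    if pk != xor_bytes(r_a, r_b):                        # PK = R_A xor R_B?
        return "reject"
    if placeholder_encrypt(pk, a, coins) != ciphertext:  # random bits correct?
        return "reject"
    return a

# Honest run (Bob's own commitment c(b) is omitted from this sketch).
r_a = secrets.token_bytes(16)
com_ra, nonce = commit(r_a)                      # round 1
r_b = secrets.token_bytes(16)                    # round 2
pk = xor_bytes(r_a, r_b)                         # round 3
coins = secrets.token_bytes(16)
a = b"alice-value"
ciphertext = placeholder_encrypt(pk, a, coins)
result = bob_reveal_checks(com_ra, r_b, pk, ciphertext, r_a, nonce, a, coins)
```

Because Alice fixes $R_A$ before seeing $R_B$, neither party alone controls the resulting key, which is the point of deriving PK by XOR.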

Completeness, soundness, binding, and B-hiding are easy to prove. A-hiding
is proved as follows. Suppose $B'$ is able to break the semantic security of the
commitment of $A$. Then we will build a machine to break the semantic security
of the dense cryptosystem. The machine will be given, as input, a public key
$PK$ and a ciphertext $C$. The machine will simulate $A$ to $B'$: it will commit to
a random string $R_A$ in the first round, and receive $c(b)$ and $R_B$ in the second
round. In the third round, it will ignore the first two rounds, and simply send
the $PK$ and $C$ that were input to it. Note that $B'$ should not be able to tell that
$PK \ne R_A \oplus R_B$; otherwise, it would be violating the hiding property of $c(R_A)$.
Therefore, $B'$ will “behave the same way” as with the true $A$, and thus would
break the semantic security of the ciphertext $C$.

A-non-correlation at commitment is simple to prove: B' has no information 
about a at the time it has to commit to b. 

B-non-correlation at commitment is proved as follows. Suppose $A'$ can
correlate to $b$. Then we will build a machine $M$ that breaks the hiding property
(semantic security) of the commitment scheme $c$, as follows. $M$ receives a
commitment $c$ to some unknown value $b$. It will generate a key pair $(PK', SK')$
for the encryption scheme, and run $A'$, simulating $B$ to it by sending it $c$ and a
random string $R_B$ in the second round. In the third round, $A'$ will send $PK$ to
$M$. $M$ will compute $R = PK \oplus R_B$ (note that, if $A'$ computed $PK$ faithfully,
then $R = R_A$), and $R'_B = PK' \oplus R$. $M$ will then rewind $A'$ to the end of the first
round and run it again, this time sending $c$ and $R'_B$ to $A'$. If $A'$ again computes
the public key faithfully, then it will encrypt $a$ with $PK'$, for which $M$ knows
the corresponding secret key $SK'$. This will allow $M$ to recover $a$, which is
correlated to the unknown committed value $b$, and thus will allow $M$ to break the
semantic security of $c$.

Of course, when $M$ runs $A'$ in this manner and $A'$ does not compute the
public key faithfully, then $M$ will fail. However, if $A'$ computes the public key
faithfully with only a negligible probability, then the commitment of $A'$ is invalid
with all but a negligible probability, so $A'$ is not correlating to $b$ any better than
a simulator who just outputs $\perp$ all the time. If, on the other hand, $A'$ computes
the public key faithfully with probability better than negligible, then $M$ will
break the semantic security of $c$ with probability better than negligible, as well.

□ 



3.3 Mutually Independent and Aware Commitment 

We present two protocols for mutually independent and aware commitment. The 
first protocol, previously known in the folklore, uses non-interactive perfectly 
binding commitments and general zero-knowledge arguments of knowledge (ar- 
guments, as opposed to proofs, are sound only if the prover is computationally 
bounded, which suffices for our case). Specifically, to minimize the number of 
rounds, we use the 5-round protocol of [FS89], which is based on one-way per- 
mutations. 

This protocol is not practical, because it uses general zero-knowledge proofs 
of NP statements. We present it here for didactic purposes: it clearly illustrates 
the notion of mutually independent and aware commitments. 

Theorem 4. If one-way permutations exist, then there exists an 11-round mu- 
tually independent and aware commitment protocol. 

Proof sketch. Let c be a commitment scheme. 

The commit stage proceeds as follows: 

1. Alice publishes a commitment c(a) to her value. 

2. Bob publishes a commitment c(b) to his value.

3. Bob uses the [FS89] ZK argument of knowledge to prove to Alice that he 
knows how to open his commitment. 

4. Alice uses the [FS89] ZK argument of knowledge to prove to Bob that she 
knows how to open her commitment. 

This takes eleven rounds, since Bob can send c(b) and the first round of his
proof in the same message.

It is easy to show that this protocol is complete, binding, and sound. If it is 
not (say) A-hiding, then whatever B' breaks A-hiding can break the commitment 
scheme, because the zero-knowledge argument of knowledge can be simulated. 
Awareness follows simply by using the extractor for the proof of knowledge. □ 

The second protocol is much more efficient than the first. It requires just
seven rounds, each of which takes only a few modular exponentiations. It
relies on the hardness of discrete logarithms. In its simplest version, it assumes
that there exists an easily indexable sequence of “safe” primes and generators
$(p_1, g_1), (p_2, g_2), \ldots, (p_k, g_k), \ldots$, one pair for every value of the security
parameter $k$, such that $p_i = 2q_i + 1$ (where $q_i$ is a prime), $g_i$ is a generator of the
subgroup of order $q_i$ in $\mathbb{Z}^*_{p_i}$, and discrete logarithm is hard in that subgroup.®
To simplify notation, we will assume the security parameter $k$ is fixed, and will
simply use $p, q, g$ in place of $p_k, q_k, g_k$ when describing our protocol.

With a loss of efficiency, our protocol can be modified to be based on general 
assumptions rather than the hardness of discrete logarithms. 

Theorem 5. Assuming the hardness of discrete logarithms, there exists a seven- 
round mutually independent and aware commitment scheme. 

Proof sketch. For clarity, we will present our protocol for commitments to
single-bit messages first, and then explain how it can be modified for longer
messages. Let $H$ be a hardcore predicate for discrete log (in particular, [BM84]
prove that the sign of the exponent minus $(p - 1)/2$ is hardcore).

Let $C$ denote a perfectly hiding trapdoor commitment scheme based on the
discrete logarithm assumption. To be specific, we use the scheme of Pedersen
[Ped91], in which one has two bases (generators), $g$ and $h = g^\alpha$, and commits
to a value $v$ by publishing $g^v h^r$, for a random $r$. The scheme is binding because
decommitting in two different ways allows one to find $\alpha$. On the other hand, the
scheme is trapdoor because knowing $\alpha$ allows one to decommit to any $v'$.
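
Both claims about the Pedersen scheme can be checked numerically. The sketch below is illustrative only and uses toy parameters chosen here as an assumption ($p = 23$, $q = 11$, $g = 4$, far too small for any security); the function names are hypothetical. It needs Python 3.8 or later for modular inverses via `pow(x, -1, q)`.

```python
P, Q, G = 23, 11, 4   # toy safe prime p = 2q + 1; G generates the order-q subgroup

def pedersen_commit(g, h, v, r, p=P):
    """Commit to v with randomness r: g^v * h^r mod p."""
    return (pow(g, v, p) * pow(h, r, p)) % p

def equivocate(alpha, v, r, v_new, q=Q):
    """Trapdoor use: find r_new so that (v_new, r_new) opens the same commitment.

    Solves v + alpha*r = v_new + alpha*r_new (mod q) for r_new.
    """
    return (r + (v - v_new) * pow(alpha, -1, q)) % q

def extract_trapdoor(v, r, v_new, r_new, q=Q):
    """Binding argument: two distinct openings reveal alpha = log_g h."""
    return ((v - v_new) * pow((r_new - r) % q, -1, q)) % q

alpha = 7                      # trapdoor, known only to whoever set up the bases
h = pow(G, alpha, P)
com = pedersen_commit(G, h, v=3, r=5)
r_new = equivocate(alpha, 3, 5, 9)   # open the same commitment to 9 instead of 3
```

Here `extract_trapdoor(3, 5, 9, r_new)` returns the same `alpha` that was used to equivocate, which is the reduction behind the binding claim: anyone who can open in two ways can compute the discrete logarithm of $h$.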

The commit stage of our protocol proceeds as follows.

1. Alice randomly generates an element $g_a$ of order $q$ and $\alpha \in \mathbb{Z}_q$, and computes
$h_a = g_a^\alpha$. She sends $(g_a, h_a)$ to Bob, to be used by him as bases for the
Pedersen commitment scheme.

2. a) Bob likewise generates $g_b$, $\beta$ and $h_b$, which he sends to Alice to be used
by her as bases for the Pedersen commitment scheme.

b) Bob generates $k_b$, $k'_b$ such that $H(k_b) \oplus H(k'_b) = b$, and sends $g^{k_b}$ and
$g^{k'_b}$ to Alice.

c) Bob generates a random $r_b \in \mathbb{Z}_q$, computes $g^{r_b}$, and then commits to
$g^{r_b}$ using Pedersen commitments with bases $g_a$ and $h_a$. He sends the
resulting commitment $C_a(g^{r_b})$ to Alice.

3. a) Alice generates $k_a$ and $k'_a$ such that $H(k_a) \oplus H(k'_a) = a$, and sends
$g^{k_a}$, $g^{k'_a}$ to Bob.

b) Alice generates a random $r_a$, and, just like Bob, commits to $g^{r_a}$ using
Pedersen commitments with bases $g_b$ and $h_b$. She sends the resulting
commitment $C_b(g^{r_a})$ to Bob.

c) Alice generates and sends to Bob a random $c_a \in \mathbb{Z}_q$.

4. a) Bob generates and sends to Alice a random $c_b$ in $\mathbb{Z}_q$.

b) Bob decommits $g^{r_b}$ from $C_a(g^{r_b})$ and sends the decommitment to Alice.

c) Bob computes $d_b = c_a k_b + r_b$, and sends $d_b$ to Alice.

5. a) Alice checks the decommitment of $g^{r_b}$ and verifies that
$g^{d_b} = (g^{k_b})^{c_a} g^{r_b}$. If the checks fail, she outputs “reject” and stops.

b) Alice decommits $g^{r_a}$ from $C_b(g^{r_a})$ and sends the decommitment to Bob.

c) Alice computes $d_a = c_b k_a + r_a$, and sends $d_a$ to Bob.

d) Alice sends $\alpha$ to Bob.

6. a) Bob checks the decommitment of $g^{r_a}$ and verifies that
$g^{d_a} = (g^{k_a})^{c_b} g^{r_a}$. If the checks fail, he outputs “reject” and stops.

b) Bob also checks that $g_a^\alpha = h_a$. If not, he outputs “reject” and stops.

c) Bob sends $k'_b$ and $\beta$ to Alice.

7. a) Alice checks that $g_b^\beta = h_b$. If not, she outputs “reject” and stops.

b) Alice checks $k'_b$ received from Bob against $g^{k'_b}$ that was sent to her in
step 2. If they do not agree, she outputs “reject” and stops. Otherwise,
she outputs “accept.”

c) Alice sends $k'_a$ to Bob.

8. Bob checks $k'_a$ against $g^{k'_a}$ that was sent to him in step 3. If they agree, he
outputs “accept.” Otherwise, he outputs “reject.”

In the reveal stage, Alice reveals $k_a$ and Bob reveals $k_b$. The value $a$ is then
calculated as $H(k_a) \oplus H(k'_a)$; $b$ is calculated similarly.

® This assumption can be relaxed by having the parties provide the parameters to
each other; moreover, we do not need primes of the form $2q_i + 1$; primes of the form
$k_i q_i + 1$, for sufficiently long prime $q_i$, would suffice. For the sake of clarity, however,
we do not present our protocol that way.

Intuition. The following may help explain what is happening in this protocol
with respect to Alice’s commitment (Bob’s commitment is, of course, similar).
In step 3(a), Alice commits to her bit $a$ by splitting it into two parts, $k_a$ and $k'_a$.
In steps 3(b), 4(a) and 5(c), she proves knowledge of $k_a$ using a Schnorr [Sch89]
three-round proof of knowledge for discrete logarithms. The only difference from
the Schnorr proof is that the initial message of Alice’s proof of knowledge is
committed using the trapdoor commitment that Bob set up for Alice in step
2(a), and Alice reveals that message in step 5(b), at the end of the proof of
knowledge. Bob reveals the trapdoor for the commitment scheme in step 6(c).
Only after Alice gets the trapdoor does she reveal $k'_a$ in step 7(c).
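
The three-round proof underlying steps 3(b), 4(a) and 5(c) can be sketched as below. This is a plain Schnorr exchange under toy parameters assumed here ($p = 23$, $q = 11$, $g = 4$); it deliberately omits the trapdoor-commitment wrapper around the first message that the protocol adds, and the function names are hypothetical.

```python
import secrets

P, Q, G = 23, 11, 4   # toy parameters: G has prime order Q in Z_P^*

def schnorr_commit():
    """Prover's first message: pick random r, send t = g^r."""
    r = secrets.randbelow(Q)
    return r, pow(G, r, P)

def schnorr_respond(k, r, c):
    """Prover's response to challenge c: d = c*k + r mod q."""
    return (c * k + r) % Q

def schnorr_verify(y, t, c, d):
    """Verifier's check: g^d == y^c * t, where y = g^k is the public value."""
    return pow(G, d, P) == (pow(y, c, P) * t) % P

k = 6                       # prover's secret exponent (k_a in the protocol)
y = pow(G, k, P)            # public value g^k sent in step 3(a)
r, t = schnorr_commit()     # corresponds to g^{r_a}
c = secrets.randbelow(Q)    # verifier's challenge (c_b in the protocol)
d = schnorr_respond(k, r, c)
accepted = schnorr_verify(y, t, c, d)
```

The verification equation is the same one Bob checks in step 6(a), $g^{d_a} = (g^{k_a})^{c_b} g^{r_a}$.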

The reason for not using Schnorr’s proof of knowledge directly is that it is
not known to be simulatable. However, if the simulator knows Bob’s trapdoor,
then we can simulate the proof. As we explain in detail below, Bob can refuse to
reveal the trapdoor, but then the simulator does not have to reveal $k'_a$. (We note
that the idea of revealing the trapdoor to allow the simulation to go through has
been used before in a number of protocols, and seems to have first appeared in
[CDM00].)

Security. It should be clear that the protocol is complete, sound, and
binding. Awareness is fairly easy to show: the extractor (say, for Bob) need only
run the protocol through where $c_a$ is sent by Alice, and repeatedly try
substituting different challenges. If Bob ever answers two different challenges,
$k_b$ can be recovered. Then, since Bob reveals $k'_b$ in the protocol, $b$ can be
determined.
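
The recovery step used by the extractor is simple algebra: two responses $d_1 = c_1 k + r$ and $d_2 = c_2 k + r$ to different challenges with the same $r$ give $k = (d_1 - d_2)/(c_1 - c_2) \bmod q$. A minimal sketch, with the toy modulus $q = 11$ and the function name `extract_witness` assumed here for illustration (Python 3.8 or later):

```python
Q = 11   # toy prime group order

def extract_witness(c1, d1, c2, d2, q=Q):
    """Recover k from two accepting responses that share the same randomness r."""
    if (c1 - c2) % q == 0:
        raise ValueError("challenges must differ modulo q")
    return ((d1 - d2) * pow((c1 - c2) % q, -1, q)) % q

# Simulate a prover with secret k answering two challenges after a rewind.
k, r = 6, 9
d1 = (3 * k + r) % Q
d2 = (8 * k + r) % Q
recovered = extract_witness(3, d1, 8, d2)
```

Rewinding is what lets the extractor obtain two answers for one first message; an honest verifier in a single run only ever sees one.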

The hiding property is a little more difficult to demonstrate. Suppose there
is a $B'$ which can find $A$’s secret values. There are two cases.

Case i: In this case, with a non-negligible probability, when Bob does not
give the correct $\beta$, he can still distinguish whether $A$ was committing to a 0
or a 1. In this case, we can break the hard-core predicate. Suppose we are given
$z = g^x$ and we are asked to find $H(x)$ with good probability. Then we first of all
randomly choose $k_a$ and run the protocol as Alice faithfully, except that we give
$z$ in place of $g^{k'_a}$. If Bob returns the correct $\beta$ we output a coin flip. Otherwise,
we get Bob’s guess $\hat{a}$ and output $\hat{a} \oplus H(k_a)$. Note that since we either output
a coin flip, or Bob doesn’t return the correct $\beta$, we never actually reach step 7,
so it doesn’t matter that we don’t know $x$.

Case ii: In this case, when Bob does not give the correct $\beta$, he can only
distinguish with negligible probability. Thus, if Bob does give $\beta$ correctly, he
must be able to distinguish. So, since he is able to distinguish with non-negligible
probability, he must give $\beta$ with non-negligible probability. Thus, we run honestly
until Bob gives $\beta$. If it is incorrect, we output a coin flip. Otherwise, we rewind
and give $z$ in place of $g^{k_a}$ and fake the proof of knowledge (which we can do
now that we have the trapdoor). If Bob then completes the protocol, we take his
guess $\hat{a}$ and output $\hat{a} \oplus H(k'_a)$. Otherwise, we output a coin flip. This will give
us an edge in guessing the most significant bit of the discrete logarithm of $z$.

Longer messages. In order to extend this protocol to longer messages, there
are two techniques. The first is the obvious one: run the protocol many times
in parallel (though we can collapse some rounds together: for instance, only one
pair $g_a, h_a$ is needed). We can do better than this by relaxing our assumptions
and assuming that the discrete logarithm problem has more hardcore bits. That
is, if $A$’s secret is $n$ bits long and $H_n(x)$ returns $n$ hardcore bits of $x$, then we
simply modify the protocol so that Alice generates $k_a$ and $k'_a$ such that
$H_n(k_a) \oplus H_n(k'_a) = a$. □
Abstract. A Zero-knowledge protocol provides provably secure entity
authentication based on a hard computational problem. Among many
schemes proposed since 1984, the most practical rely on factoring and
discrete log, but still there are practical schemes based on NP-hard problems.
Among them, the problem SD of decoding linear codes is, in spite of some
30 years of research effort, still exponential. We study a more general
problem called MinRank that generalizes SD and contains also other well
known hard problems. MinRank is also used in cryptanalysis of several
public key cryptosystems such as birational schemes (Crypto’93), HFE
(Crypto’99), GPT cryptosystem (Eurocrypt’91), TTM (Asiacrypt’2000)
and Chen’s authentication scheme (1996).

We propose a new Zero-knowledge scheme based on MinRank. We prove 
it to be Zero-knowledge by black-box simulation. An adversary able to 
fraud for a given MinRank instance is either able to solve it, or is able 
to compute a collision on a given hash function. 

MinRank is one of the most efficient schemes based on NP-complete 
problems. It can be used to prove in Zero-knowledge a solution to any 
problem described by multivariate equations. We also present a version 
with a public key shared by a few users, that allows anonymous group 
signatures (a.k.a. ring signatures). 

Keywords: Zero-knowledge, identification, entity authentication, Min- 
Rank problem, NP-complete problems, multivariate cryptography, rank- 
distance codes, syndrome decoding (SD), group signatures, ring signa- 
tures. 



1 Introduction 

The general problem we address is the classical problem of interactive entity
authentication. It has been known since Fiat-Shamir [5] that solving this problem,
combined with a cryptographic hash function, also allows non-interactive
authentication, for example digital signatures.

* The work described in this paper has been supported by the French Ministry of
Research under RNRT Project "Turbo-signatures".

C. Boyd (Ed.): ASIACRYPT 2001, LNCS 2248, pp. 402-421, 2001.

(c) Springer-Verlag Berlin Heidelberg 2001

The notion of Zero-knowledge identification was formalized by Goldwasser,
Micali and Rackoff in [18]. In such a scheme a Prover proves his identity
to a Verifier. Provided the underlying problem is difficult, we prove that there is
no interactive strategy for the Verifier communicating with the Prover that extracts
any information whatsoever on the Prover's secret. Several such schemes have
been proposed since the original Fisher-Micali-Rackoff scheme (1984), and the
most practical ones are the Fiat-Shamir, Guillou-Quisquater and Schnorr schemes.
Unfortunately they rely on problems that are (believed) not NP-hard, such as
factoring or discrete log. Still, there are schemes that use an NP-hard problem and
remain practical, for example PKP by Shamir [31], CLE by Stern [35] or PPP by
Pointcheval [27]. However, the most interesting schemes are in our opinion the
schemes related to coding, as the decoding problem(s) have been believed intractable
ever since the 1970s [2]. There were many proposals [34,40,20,16,4] and the best
of them is the scheme SD by Stern [34,40]. The simplest decoding problem is
the problem of Syndrome Decoding (SD), which consists of finding a small-weight
vector in an affine subspace of a linear space. Similarly, the MinRank problem
is the problem of finding a linear (or affine) combination of given matrices that
has a small rank. Both problems are NP-hard. Moreover, SD has withstood
more than 20 years of extensive research on the cryptanalysis of the McEliece
cryptosystem [22], and all the known attacks on SD are still exponential
[1,3,21,36,40]. MinRank in fact contains SD and thus is also probably exponential. It
also contains the decoding problem for the rank-distance codes of Gabidulin, used in
the public-key authentication scheme of Chen [4], cryptanalysed in [37,11], and also
used in the public-key encryption scheme GPT [14]. The MinRank problem, not
always named so, has many applications in the cryptanalysis of various schemes, such
as Shamir's birational schemes [30,6,7], cryptanalysed by Coppersmith, Stern and
Vaudenay by solving a MinRank with a small rank. Similarly, Goubin and Courtois
broke the TTM cryptosystem in [19]. In [32] Shamir and Kipnis reduced the
cryptanalysis of the Hidden Field Equations (HFE) scheme [24] to MinRank.

In the present paper we present a new Zero-knowledge protocol for MinRank.
More precisely we show how to prove in Zero-knowledge an ability to
compute (or have) MinRank solutions. We may build instances that have only
one solution, and for those it will also be a proof of knowledge. We show that
the scheme can also be applied to prove in Zero-knowledge a solution to any
other problem expressed as a system of multivariate equations over a finite field.

The paper is organized as follows. First we recall the basic requirements of a
Zero-knowledge protocol. Then §3 defines MinRank and studies related hard
problems. §4 shows how to build secure instances for practical use, evaluated
against all the attacks currently known for MinRank. In §5 we describe key
generation and setup of the MinRank identification, which is described in §6. The
following §7 gives proofs of completeness, soundness and Zero-knowledge. Then
in §8 we analyse the performance of the scheme and in §8.2 we compare it to other
schemes based on NP-complete problems. In Appendix A we compute useful
probability distributions for ranks of matrices. Appendix B contains various
practical improvements to the scheme, notably reducing the fraud probability
from 2/3 to 1/2. Finally, Appendix C shows that MinRank allows authentication
and signature for any small subgroup of a given group of users
sharing the same public key.



2 Zero-Knowledge Protocols 

An interactive protocol involves two entities/strategies: the Prover (P) and the
Verifier (V), which will be two probabilistic Turing machines. The Verifier and
the Prover interact and at the end the Verifier gives an answer: Accept or Refuse.

In known Zero-knowledge protocols, there is a possibility of fraud: a cheater
is usually able to answer some types of questions (for which he prepared
in advance) but not all of them. The protocols are designed in such a way
that an answer to one question gives no information (Zero-knowledge), while
answering all the questions is proved to reveal the Prover's secret (Soundness). The
security is in fact based on the impossibility for the Prover to predict the Verifier's
questions. If we iterate the protocol, the global fraud probability becomes
as small as we want.

A Zero-knowledge identification scheme should be complete, sound and
Zero-knowledge:

Completeness. The legitimate Prover always gets accepted.

(Computational) Soundness. An illegitimate Prover will be rejected with
some fixed probability. We usually show that a Prover that always succeeds can be
used to extract the Prover's secret (a knowledge extractor).

Zero-knowledge. It is much stronger than saying the Verifier learns
nothing about the secret. We demand that no Verifier strategy can extract
any information from the Prover, even in several interactions. It gives provable
security against active attacks. Proofs are made by simulation using the Verifier
as an oracle, or black box, and therefore this definition has been called black box
(computational) Zero-knowledge, as formalized by Goldreich and Oren [17]:

Definition 1 (Black box Zero-knowledge, [17]). A strategy $P$ is said to be
black box Zero-knowledge on inputs from $S$ (common input) if there exists an
efficient simulating algorithm $U$ so that for every feasible Verifier strategy $V$,
the two following probability ensembles are computationally indistinguishable:

- $\{(P,V)(x)\}_{x \in S} \stackrel{def}{=}$ all the outputs of $V$ when interacting
  with $P$ on a common input $x \in S$.

- $\{U(V)(x)\}_{x \in S} \stackrel{def}{=}$ the output of $U$ using $V$ as a black box,
  on $x \in S$.

The definition above is strong and still realistic: all well-known Zero-knowledge
protocols are proven in this model.

3 The MinRank Problem

Let $M_0; M_1, \ldots, M_m$ be some $\eta \times n$ matrices over a ring $R$. The problem
MinRank$(\eta, n, m, r, R)$ is to find a solution $\alpha \in R^m$ such that:

$$\mathrm{Rank}\Big(\sum_{i=1}^{m} \alpha_i M_i - M_0\Big) \le r.$$
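
The definition can be made concrete with a small checker. The following is only a sketch under stated assumptions: the ring $R$ is taken to be a prime field $GF(p)$, matrices are plain lists of lists, and the helper names `rank_mod_p` and `is_minrank_solution` are ours, not part of the scheme. It needs Python 3.8 or later for the modular inverse `pow(x, -1, p)`.

```python
def rank_mod_p(matrix, p):
    """Rank of a matrix over GF(p), p prime, by Gaussian elimination."""
    rows = [list(r) for r in matrix]
    rank = 0
    for col in range(len(rows[0])):
        # Look for a pivot in this column among the rows not yet used.
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col] % p), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, p)          # modular inverse of the pivot
        rows[rank] = [v * inv % p for v in rows[rank]]
        for i in range(len(rows)):
            if i != rank and rows[i][col] % p:
                f = rows[i][col]
                rows[i] = [(a - f * b) % p for a, b in zip(rows[i], rows[rank])]
        rank += 1
    return rank


def is_minrank_solution(M0, Ms, alpha, r, p):
    """Check Rank(sum_i alpha_i * M_i - M0) <= r over GF(p)."""
    eta, n = len(M0), len(M0[0])
    combo = [[(sum(a * M[i][j] for a, M in zip(alpha, Ms)) - M0[i][j]) % p
              for j in range(n)] for i in range(eta)]
    return rank_mod_p(combo, p) <= r


# Toy instance over GF(7) with m = 2 matrices of size 2 x 2.
M0 = [[1, 0], [0, 1]]
M1 = [[1, 0], [0, 0]]
M2 = [[0, 0], [0, 1]]
print(is_minrank_solution(M0, [M1, M2], [1, 0], 1, 7))  # M1 - M0 has rank 1
print(is_minrank_solution(M0, [M1, M2], [0, 0], 1, 7))  # -M0 has rank 2
```

Checking a candidate $\alpha$ is therefore cheap (one Gaussian elimination); the hardness lies entirely in finding $\alpha$.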



3.1 Related Problems 

This version of MinRank is a generalization of one among many NP-complete
rank problems studied in [23] and [10]. In our scheme $R$ will be a finite
field $GF(q)$.

MinRank over a field can be defined in terms of codes: it is a decoding 
problem for a kind of subfield subcode of Gabidulin’s linear rank-distance code 
over GF{q^) [13,11,37]. Currently one of the two best known attacks to decode 
rank distance codes is based on MinRank [11,37]. Therefore MinRank is essen- 
tial to the security of Chen and GPT public key schemes [14,4,11]. MinRank 
also appears in attacks known on the HFE [32,8,10], TTM cryptosystem [19] 
and Shamir’s birational signature scheme [30,6,7]. Finally, as we show in §3.3, 
MinRank contains the SD problem for ordinary codes that underlies the security 
of McEliece [22] and various identification schemes [34,40,16,20]. 

MinRank over rings should also be mentioned. MinRank over $\mathbb{Z}$ might be
broken by the widely-used LLL algorithm. Indeed, when all the $M_i$ are diagonal
of size up to 300 x 300, the problem is to find a vector in a lattice with a small
number of non-zero elements, and this problem is closely related to the well-known
lattice reduction problem that has numerous applications in cryptography.
Still, MinRank over $\mathbb{Z}$ is undecidable in general, because it can encode any
set of diophantine equations (Hilbert's tenth problem) [23].

3.2 Encoding NP Problems as MinRank 

The problem of proving in Zero-knowledge that a system of equations over a
finite field has a solution has already been solved in [12] under RSA or DL
intractability. Our solution is based on an NP-complete problem.

Theorem 1 (Determinant Universality, Valiant 1979). Any set of multivariate
equations over a ring can be encoded as a determinant of a matrix with
entries being constants or variables.

It was first shown by Valiant [38]. For a simpler, and still effective proof
see [23]. Both give an effective algorithm to encode any set of multivariate
polynomial equations as a MinRank. However the size of matrices it gives seems
hard to improve: for $m$ equations of degree $d$ with $n$ variables we need matrices
of width about $mn^d$.

From now on we always suppose that $R = GF(q)$. Solving multivariate quadratic
equations over a field is NP-hard [26], thus:

3.3 MinRank Is NP-Hard

The proof of [23], however, gives instances of MinRank in which the size of
the matrices will be polynomial in the number of matrices. It might seem that
MinRank is less secure with $m$ matrices $n \times n$ and $m$ and $n$ being of the same order
of magnitude. We are going to show a reduction from an NP-complete problem
that gives instances that are known to be hard both in theory and in practice, with
$m$, $n$ and $r$ being of the same order of magnitude. We reduce from the Syndrome
Decoding problem of a linear error correcting code, which is NP-complete. The
proof for the case $q = 2$ is to be found in [2], and an extension to an arbitrary
field is sketched in [39], page 1764. Let $(n, k, d)$ be an error correcting code. The
encoding is trivial: each of the rows of the generating matrix is put on the
diagonal of an $n \times n$ matrix $M_i$ that has all 0's elsewhere. Similarly $M_0$
contains the fixed codeword to decode. Solving MinRank with rank $r$ is then
equivalent to correcting $r$ errors.
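
The encoding above can be sketched in a few lines. This is an illustration only, with a toy binary code of our own choosing (the generator matrix `G` and received word `y` below are hypothetical examples, and the function names are ours). It relies on the fact that the rank of a diagonal matrix equals the number of its non-zero diagonal entries, i.e. the Hamming weight of the error.

```python
def diagonal(vec):
    """n x n matrix with vec on the diagonal and 0's elsewhere."""
    n = len(vec)
    return [[vec[i] if i == j else 0 for j in range(n)] for i in range(n)]


def sd_to_minrank(G, y):
    """Map a decoding instance (generator matrix G, word y) to (M0, [M1..Mk])."""
    return diagonal(y), [diagonal(row) for row in G]


def combination(M0, Ms, alpha, p):
    """Compute sum_i alpha_i * M_i - M0 over GF(p)."""
    n = len(M0)
    return [[(sum(a * M[i][j] for a, M in zip(alpha, Ms)) - M0[i][j]) % p
             for j in range(n)] for i in range(n)]


def rank_of_diagonal(D):
    """For a diagonal matrix the rank is the number of non-zero diagonal entries."""
    return sum(1 for i in range(len(D)) if D[i][i] != 0)


# Toy (4, 2) binary code; y is the codeword of message (1, 1) with one bit flipped.
G = [[1, 0, 1, 1],
     [0, 1, 0, 1]]
y = [1, 1, 0, 0]
M0, Ms = sd_to_minrank(G, y)
D = combination(M0, Ms, [1, 1], 2)
print(rank_of_diagonal(D))  # weight of the error pattern for message (1, 1)
```

Here the message $(1,1)$ plays the role of $\alpha$, and a MinRank solution with $r = 1$ corresponds to correcting one error.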

4 MinRank Instances and Attacks 

4.1 Preliminary Requirement 

The instance of MinRank should be chosen in such a way that the probability that it
has many solutions (apart from those we might put by construction) is
small. One possible way to achieve this is an explicit reduction from an instance
of another problem that has only one solution, as for example in §3.2.

Another way is to choose parameters such that the probability that it has a
solution, given in Appendix A, is small, and thus we will be able to build instances
with one (constructed) solution that are unlikely to have (m)any more. In this
case, as we show in Appendix A, we need to have

$$m \le m_{max} \quad \text{with} \quad m_{max} \stackrel{def}{=} \eta n + r^2 - (\eta + n) r + 1$$
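
As a quick sanity check of this bound, the snippet below evaluates $m_{max}$ for the six square parameter sets A-F proposed in §4.3. It is a sketch only; the function name `m_max` and the dictionary layout are ours.

```python
def m_max(eta, n, r):
    """Bound on the number of matrices: eta*n + r^2 - (eta + n)*r + 1."""
    return eta * n + r * r - (eta + n) * r + 1


# (n, r, m) for the square parameter sets A-F of Section 4.3 (eta = n).
parameter_sets = {
    "A": (6, 3, 10),
    "B": (7, 4, 10),
    "C": (11, 8, 10),
    "D": (19, 10, 81),
    "E": (21, 10, 121),
    "F": (29, 15, 190),
}

for name, (n, r, m) in parameter_sets.items():
    bound = m_max(n, n, r)
    print(name, "m =", m, "m_max =", bound, "ok" if m <= bound else "too many matrices")
```

For square matrices the expression simplifies to $(n - r)^2 + 1$, which makes the dependence on the rank defect $n - r$ explicit.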

4.2 Known Attacks 

We assume $\eta \ge n$.¹ There are five attacks known for the problem MinRank. Let
$\omega$ be the exponent of the Gaussian reduction, $2 \le \omega \le 3$; in practice $\omega \approx 3$.

Exhaustive search. It is , see [10] for details. 

Attacking square MinRank with $r \approx n$. In some cases the exhaustive
search may break MinRank.² For example we consider a MinRank with $m$
matrices $n \times n$ and with $r = n - s$. Then we have $m_{max} = r^2 + n^2 - 2nr = s^2$. A
randomly generated MinRank with such parameters can be solved in about $q^{s^2}$,
which can be quite small. However if the MinRank with $m \gg m_{max}$ is generated
from a reduction from another problem (see §3.2) having not too many solutions,
it is still secure.

¹ The problem is symmetric with respect to transposition of matrices with swapping
$\eta$ and $n$, and by inspection we verify that all the complexities given in the present
paper are already given for the better of the two cases.

² This attack was suggested to me by Prof. Claus P. Schnorr.

Attack Using Sub-matrices. This simple attack works only if $r \ll n$, which is not
the case in this paper, and was first used by Coppersmith, Stern and Vaudenay
in [6,7]. It was then described in detail and used in [8,9].

MQ-solving Attacks. Another attack that works only for $r \ll n$ is due
to Shamir and Kipnis [32]. It reduces MinRank to the MQ problem, i.e. to a
system of Multivariate Quadratic equations. If $r \ll n$ the system is overdefined,
and surprisingly such a system will be solved in expected polynomial time [32].
Improved algorithms will give roughly about see [10,11,8,33]. 

Since we will never have r << n, both these attacks fail. 



The Kernel Attack is the best attack for the parameter sets we propose. It
is due to Louis Goubin and described in [19] with a complexity of
$q^{\lceil m/n \rceil r} m^{\omega}$ for $\eta = n$. A more general version described in [10] and [11] gives

, gl^Jr-e(m mod n) y ^ 

For small r there are further improvements described in [11]. 



The "Big m" Attack. This attack is designed for $m \gg n$ and is described
in [11] and [10]. It is trivial and consists of constraining as many entries of the
matrix $M$ as possible to 0. It runs in

^Max{0,-n{n-r)-m) ^ 



The Syndrome Attack. Another attack for $m \gg n$; it is described in [11]
and [10]. It is not very practical and gives about 



Hard Instances: All the attacks known for MinRank described above are ex- 
ponential in general. In a work in progress, [11] it is conjectured that for fixed 
rj = n the best security of gA” is achieved with r = n/3 and to « If to is 
fixed, one may also build instances as close as we want to the exhaustive search
if we put $n > 3\sqrt{m}$ and as big as possible, and with $r = n - \sqrt{m}$.



4.3 Practical Parameter Choices 

We propose six sets of parameters A-F that use square matrices ($\eta = n$) and
work either over $GF(2)$ or over $GF(65521)$, 65521 being the biggest prime that fits
in 16 bits. In the following table we compare the complexity of all known attacks
described above for A-F, and give the communication complexity computed following
§B.2, as well as the probability that the instance has a solution, computed in §A.

For comparison we also include two MinRank instances that appear in the
Shamir-Kipnis attack on the HFE cryptosystem³ [32], given for the HFE Challenge
1 [24,9] and for a subsystem of Quartz [25].



Cryptosystem        MinRank identification                      HFE
Parameter set       A      B      C      D      E      F       Chall. 1   Quartz⁴
m                   10     10     10     81     121    190     80         103
n                   6      7      11     19     21     29      80         103
η                   6      7      11     19     21     29      80         103
r                   3      4      8      10     10     15      7          8
q                   65521  65521  65521  2      2      2                  2^103
Pr_α[Rank ≤ r]      0.6    0.6    0.6    0.6    0.6    2^-6    …
20 × Comm. [Kb]     1.94   2.99   4.86   2.17   2.36   3.13

Attack
Brute force         …      …      …                    2^205   2^80       2^103
Kernel                            2^138                2^128   2^577      2^844
Big m               …      2^205  2^399                        …          …
Syndrome                          2^1002               2^359   …          …
Sub-matrices        ∞      ∞      ∞      ∞      ∞      ∞
MQ                  ∞      ∞      ∞      ∞      ∞      ∞       2^152      …



5 Setup of MinRank Identification 

5.1 Key Setup 

The public key consists of $1 + m$ matrices $\eta \times n$ over a finite field $GF(q)$:
$M_0; M_1, \ldots, M_m$. Let $r < n$. To generate a random hard⁵ instance we pick
$1 + m - 1$ (pseudo-)random matrices $M_0; M_1, \ldots, M_{m-1}$. We choose a random
$M$ of rank $r$ and we "adapt" $M_m$. For this we pick a random $\alpha \in GF(q)^m$ such
that $\alpha_m \ne 0$, and $M_m$ is computed as:

$$M_m = \Big(M + M_0 - \sum_{i=1}^{m-1} \alpha_i M_i\Big) / \alpha_m$$

In practice, we generate $M$ and $M_1, \ldots, M_{m-1}$ out of a pseudo-random
generator with a seed of 160 bits. It is better to pick all $M_i$ invertible, but it is not
necessary. We may use the well-known LU method⁶ to generate a deterministic
pseudo-random invertible matrix. In order to generate $M$, first we generate a
matrix $L$ which is a random invertible $r \times r$ matrix, completed with 0's to an
$\eta \times n$ matrix. Then a random couple of invertible matrices $S$ and $T$ is applied:
$M = SLT$, see Lemma 1.

³ The brute force workfactors given in the table for HFE correspond to the direct
brute force attack on HFE itself, not on MinRank, which would give much more.

⁴ Since it is only a subsystem, an attack on MinRank does not really break Quartz.

⁵ The instances of MinRank generated here are such that the matrices, and a linear
combination that yields a small rank, are all random and uniformly distributed. It is
believed to give hard instances most of the time with respect to all the attacks from
section 4.2. It might change if a better way to produce hard instances is known. The
same problem is an issue for any cryptosystem based on an NP-complete problem:
there is a difference between an NP-complete problem in general, and the actual
instances in the samplable distribution generated by a finite-length algorithm.



The secret key. It is the solution $\alpha \in GF(q)^m$ such that

$$\mathrm{Rank}\Big(\sum_{i=1}^{m} \alpha_i M_i - M_0\Big) = r.$$

Key sizes. All the public key is generated out of a pseudo-random generator
with a seed of 160 bits, except $M_m$, which is transmitted. The size of the public
key is thus only $160 + \eta n \log_2 q$ bits. The secret key requires only additional
$m \log_2 q$ bits to store $\alpha$.
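
The key setup of §5.1 can be sketched as follows. This is a minimal illustration, not the reference implementation: it assumes a prime field $GF(p)$, uses Python's non-cryptographic `random` module in place of the 160-bit seeded generator, and draws invertible matrices by rejection sampling rather than by the LU method. All function names are ours. Python 3.8 or later is needed for `pow(x, -1, p)`.

```python
import random


def rank_mod_p(matrix, p):
    """Rank over GF(p) by Gaussian elimination."""
    rows, rank = [list(r) for r in matrix], 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col] % p), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, p)
        rows[rank] = [v * inv % p for v in rows[rank]]
        for i in range(len(rows)):
            if i != rank and rows[i][col] % p:
                f = rows[i][col]
                rows[i] = [(a - f * b) % p for a, b in zip(rows[i], rows[rank])]
        rank += 1
    return rank


def rand_matrix(rows, cols, p):
    return [[random.randrange(p) for _ in range(cols)] for _ in range(rows)]


def rand_invertible(n, p):
    while True:  # rejection sampling; an assumption replacing the LU method
        A = rand_matrix(n, n, p)
        if rank_mod_p(A, p) == n:
            return A


def mat_mul(A, B, p):
    return [[sum(a * b for a, b in zip(row, col)) % p for col in zip(*B)] for row in A]


def keygen(eta, n, m, r, p):
    """Return ((M0, [M1..Mm]), alpha) with Rank(sum alpha_i M_i - M0) = r."""
    M0 = rand_matrix(eta, n, p)
    Ms = [rand_matrix(eta, n, p) for _ in range(m - 1)]
    L = [[0] * n for _ in range(eta)]            # invertible r x r block, padded with 0's
    for i, row in enumerate(rand_invertible(r, p)):
        L[i][:r] = row
    M = mat_mul(mat_mul(rand_invertible(eta, p), L, p), rand_invertible(n, p), p)
    alpha = [random.randrange(p) for _ in range(m - 1)] + [random.randrange(1, p)]
    inv = pow(alpha[-1], -1, p)
    # M_m = (M + M0 - sum_{i<m} alpha_i M_i) / alpha_m
    Mm = [[(M[i][j] + M0[i][j] - sum(a * Mi[i][j] for a, Mi in zip(alpha, Ms))) * inv % p
           for j in range(n)] for i in range(eta)]
    return (M0, Ms + [Mm]), alpha


def secret_matrix(pub, alpha, p):
    """Recompute M = sum_i alpha_i M_i - M0 from the keys."""
    M0, Ms = pub
    return [[(sum(a * Mi[i][j] for a, Mi in zip(alpha, Ms)) - M0[i][j]) % p
             for j in range(len(M0[0]))] for i in range(len(M0))]


pub, alpha = keygen(eta=6, n=6, m=10, r=3, p=65521)   # parameter set A
print(rank_mod_p(secret_matrix(pub, alpha, 65521), 65521))
```

The final line recomputes $\sum \alpha_i M_i - M_0$ from the generated keys and prints its rank, which should equal the chosen $r$.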

6 MinRank Identification Scheme 

We use a collision-intractable one-way hash function $H$ for commitments that is
supposed to behave as a random oracle. The Prover is going to convince the
Verifier of his knowledge of $\alpha$ (and $M$).

The Prover chooses two random invertible matrices $S$, $T$ that are $\eta \times \eta$ and
$n \times n$, and a totally random $\eta \times n$ matrix $X$. We call $STX$ the triple $(S, T, X)$.
Then, he picks a random combination $\beta_1$ of the $M_i$:

$$N_1 = \sum \beta_{1i} \cdot M_i$$

He puts $N_2 = M + M_0 + N_1$ and uses his secret expression of $M$ to get:

$$N_2 = \sum \beta_{2i} \cdot M_i$$

We have $\beta_2 - \beta_1 = \alpha$, but each of the $\beta_i$ (taken separately) is random and
uniformly distributed. Each of the $N_i$ is just a random combination of the $M_i$.



One Round of Affine MinRank Identification:

1. The Prover sends to the Verifier:

   $H(STX)$, $H(TN_1S + X)$, $H(TN_2S + X - TM_0S)$

2. The Verifier chooses a query $Q \in \{0, 1, 2\}$ and sends $Q$ to the Prover.

3. If $Q = 0$ the Prover gives the following values:

   $(TN_1S + X)$, $(TN_2S + X - TM_0S)$

   Verification $Q = 0$: The Verifier accepts if
   $H(TN_1S + X)$ and $H(TN_2S + X - TM_0S)$ are correct and if

   $$(TN_2S + X - TM_0S) - (TN_1S + X) = TMS$$

   is indeed a matrix of rank $r$.

3'. If $Q = 1, 2$ the Prover reveals:

   $STX$, $\beta_Q$

   Verification $Q = 1, 2$: The Verifier checks if $S$ and $T$ are invertible and
   $H(STX)$ is correct. Then he computes

   $$TN_QS = \sum \beta_{Qi} \, TM_iS$$

   and verifies $H(TN_1S + X)$ or $H(TN_2S + X - TM_0S)$.

⁶ This method is known to give a slight bias, but it seems easy to repair, for example
by multiplying a few such matrices and permuting columns.
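
One round of the protocol above can be sketched end to end. The code below is an illustration under several assumptions that are ours and not the paper's: square matrices ($\eta = n$) over the prime field $GF(65521)$, commitments computed as SHA-1 of the Python `repr` of the committed values, the full triple $(S, T, X)$ sent instead of a seed, and Python's non-cryptographic `random` module. All function names are hypothetical.

```python
import hashlib
import random

P = 65521  # prime field size q


def rank(A):
    rows, rk = [r[:] for r in A], 0
    for c in range(len(rows[0])):
        piv = next((i for i in range(rk, len(rows)) if rows[i][c]), None)
        if piv is None:
            continue
        rows[rk], rows[piv] = rows[piv], rows[rk]
        inv = pow(rows[rk][c], -1, P)
        rows[rk] = [v * inv % P for v in rows[rk]]
        for i in range(len(rows)):
            if i != rk and rows[i][c]:
                f = rows[i][c]
                rows[i] = [(a - f * b) % P for a, b in zip(rows[i], rows[rk])]
        rk += 1
    return rk


def rand_mat(n):
    return [[random.randrange(P) for _ in range(n)] for _ in range(n)]


def rand_inv(n):
    while True:
        A = rand_mat(n)
        if rank(A) == n:
            return A


def mul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) % P for col in zip(*B)] for row in A]


def add(A, B, c=1):  # A + c*B
    return [[(a + c * b) % P for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def combo(coeffs, mats):  # sum_i coeffs[i] * mats[i]
    n = len(mats[0])
    return [[sum(c * M[i][j] for c, M in zip(coeffs, mats)) % P for j in range(n)]
            for i in range(n)]


def H(*parts):
    return hashlib.sha1(repr(parts).encode()).hexdigest()


def keygen(n, m, r):
    M0, Ms = rand_mat(n), [rand_mat(n) for _ in range(m - 1)]
    core = [[0] * n for _ in range(n)]
    for i, row in enumerate(rand_inv(r)):
        core[i][:r] = row
    M = mul(mul(rand_inv(n), core), rand_inv(n))          # random matrix of rank r
    alpha = [random.randrange(P) for _ in range(m - 1)] + [random.randrange(1, P)]
    inv = pow(alpha[-1], -1, P)
    rest = add(add(M, M0), combo(alpha, Ms), -1)          # M + M0 - sum_{i<m} alpha_i M_i
    return (M0, Ms + [[[v * inv % P for v in row] for row in rest]]), alpha


def commit(pub, alpha):
    M0, Ms = pub
    n = len(M0)
    S, T, X = rand_inv(n), rand_inv(n), rand_mat(n)
    beta1 = [random.randrange(P) for _ in alpha]
    beta2 = [(b + a) % P for b, a in zip(beta1, alpha)]   # beta2 - beta1 = alpha
    A1 = add(mul(mul(T, combo(beta1, Ms)), S), X)                          # T N1 S + X
    A2 = add(add(mul(mul(T, combo(beta2, Ms)), S), X), mul(mul(T, M0), S), -1)
    return (H(S, T, X), H(A1), H(A2)), (S, T, X, beta1, beta2, A1, A2)


def respond(state, Q):
    S, T, X, beta1, beta2, A1, A2 = state
    return (A1, A2) if Q == 0 else (S, T, X, beta1 if Q == 1 else beta2)


def verify(pub, r, commitment, Q, answer):
    M0, Ms = pub
    h_stx, h1, h2 = commitment
    if Q == 0:
        A1, A2 = answer
        return H(A1) == h1 and H(A2) == h2 and rank(add(A2, A1, -1)) == r
    S, T, X, beta = answer
    n = len(M0)
    if rank(S) != n or rank(T) != n or H(S, T, X) != h_stx:
        return False
    A = add(mul(mul(T, combo(beta, Ms)), S), X)
    if Q == 2:
        A = add(A, mul(mul(T, M0), S), -1)
    return H(A) == (h1 if Q == 1 else h2)


pub, alpha = keygen(n=6, m=10, r=3)
for Q in (0, 1, 2):
    c, st = commit(pub, alpha)
    print(Q, verify(pub, 3, c, Q, respond(st, Q)))
```

A single run only exercises completeness (the honest Prover is accepted for each query); soundness and Zero-knowledge are properties of the protocol design, not something this sketch tests.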



6.1 Completeness 

It is clear that a legitimate Prover that knows $\alpha$ always succeeds.

6.2 Soundness 

We will show that a false Prover is rejected with probability $1/3$. Let $C$ (Charlie
or the Cheater) be an expected polynomial time Turing machine. We suppose
that there is such a false Prover $C$ that can answer all the questions $Q$. In fact
the proof below shows that such a Prover will either be able to compute
a collision for $H$, or be able to solve the given instance of the
NP-complete problem MinRank.⁷

Proof: $C$ commits (with $H$) to the values of $TN_1S + X$ and $TN_2S + X$. For $Q = 1$
and 2 he proves that he has indeed generated them in the form $X + T(\sum \beta_{1i} M_i)S$
and $X + T(\sum \beta_{2i} M_i)S$. In both cases we verify $H(STX)$ and we are certain that
he used the same $X$, $S$ and $T$. Finally when $Q = 0$ we will verify that the rank of the
following matrix is indeed $r$:

$$\Big(T\big(\sum \beta_{2i} M_i\big)S - TM_0S + X\Big) - \Big(T\big(\sum \beta_{1i} M_i\big)S + X\Big) = \sum_{i=1}^{m} (\beta_{2i} - \beta_{1i}) \cdot TM_iS - TM_0S$$

When $Q = 1$ or 2 we check that $S$ and $T$ are invertible, thus

$$\sum_{i=1}^{m} (\beta_{2i} - \beta_{1i}) \cdot M_i - M_0$$

is also of rank $r$. Thus the Prover knows a solution to MinRank $\alpha = (\beta_2 - \beta_1)$,
i.e. either the secret key $\alpha$ or an equivalent one. □

⁷ Here it can be just any instance of MinRank, however in the practical authentication
the public key is generated in a specific way, see the footnote on hard instances in §5.1.

One can see that the fraud probability for several rounds is:

$$\Pr_{fraud} = \left(\tfrac{2}{3}\right)^{\#rounds}$$

For details and an improvement to $\left(\tfrac{1}{2}\right)^{\#rounds}$, see Appendix B.

7 Black-Box Zero-Knowledge of MinRank 

Let the Prover strategy $P$ be a probabilistic average polynomial time Turing
machine. We suppose that $P$ is a random function (oracle). The simplicity of
MinRank makes it very easy to show it is Zero-knowledge.

- In cases $Q = 1, 2$ we only disclose random unrelated variables $S, T, \beta_Q, X$.

- The case $Q = 0$: disclosing $(TN_1S + X)$ and $(TN_2S - TM_0S + X)$ is equivalent
  to disclosing $(TN_1S + X)$ and their difference $TN_2S - TM_0S - TN_1S = TMS$.

  Since $X$ is completely random, $(TN_1S + X)$ is a random matrix independent
  from $TMS$. As for $TMS$, we show that it is a uniformly distributed matrix
  of rank $r$:



Lemma 1. Let $M$ be an $\eta \times n$ matrix of rank $r$. Let $S$ and $T$ be two uniformly
distributed random invertible matrices $\eta \times \eta$ and $n \times n$. Then $TMS$ is uniformly
distributed among all $\eta \times n$ matrices of rank $r$.

Proof sketch: All the $\eta \times n$ matrices $M$ of rank $r$ are equivalent modulo
invertible variable changes and can be written as:

$$M = S' \cdot \begin{pmatrix} Id_{r \times r} & 0_{r \times (n-r)} \\ 0_{(\eta-r) \times r} & 0_{(\eta-r) \times (n-r)} \end{pmatrix} \cdot T'$$

7.1 The Exact Proof of Zero-Knowledge by Simulation

We construct a simulator $U$ with oracle access to $V$, see Def. 1:

1. $U(V)$ chooses a random query $Q \in \{1, 2\}$. He will prepare to answer
   questions 0 and $Q$.

2. He chooses $N = \sum \delta_i M_i$ with a random $\delta$.

3. He picks up $STX = (S, T, X)$ with invertible $S$ and $T$.

4. He picks up a random matrix $R$ of rank $r$.

5. Let $N_Q = N$ and $N_{3-Q} = N + (-1)^{Q+1}(R + M_0)$. Now $N_2 - N_1 = R + M_0$.

6. He asks for the Verifier's query on his commitment:

   $Q' = V\big(H(STX), H(TN_1S + X), H(TN_2S - TM_0S + X)\big) \in \{0, 1, 2\}$.

7. He repeats steps 1-6 about 2 times (rewinding),
   until he does get one of the two queries he has prepared to answer:
   $Q' \in \{0, Q\}$.

8. If $Q' = 0$ the simulator $U(V)$ reveals $(TN_2S + X - TM_0S)$ and $(TN_1S + X)$
   with indeed a difference $TRS$ of rank $r$.

8'. If $Q' = Q$ the simulator $U(V)$ reveals $STX$ and $\delta$, which were indeed used to
   construct the committed $TN_QS + X\,[-TM_0S]$.

8 Performance of the Scheme 

8.1 Communication Complexity 

We assume that hash values are computed with SHA-1. Thus we need $3 \cdot 160 + 2$
bits for the first two passes.

We note that the values of $STX = (S, T, X)$ do not need to be transmitted:
they are in practice generated using a pseudorandom generator out of a seed
of 160 bits, using the method we described in §5 to generate pseudorandom
invertible matrices $S$ and $T$.⁸

The last pass requires $2 n \eta \log_2 q$ bits in the case $Q = 0$. In the two other cases it
requires $160 + m \log_2 q$ bits. The weighted average bit complexity for the whole
scheme is $3 \cdot 160 + 2 + \frac{2}{3} \cdot 160 + \frac{2}{3}(n\eta + m) \log_2 q$.

This is to be multiplied by the number of rounds, which is $\ge 35$ for the round
fraud probability of 2/3. In Appendix B we show how to achieve 1/2 instead
(which will require only 20 rounds) and present several other improvements. Our
best scheme (cf. B.2 and B.3) gives a communication complexity as low as:

Comm, [in bits] = 2-160-1- ^4 - 160 -1-8-1- ^ ^ rounds 
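
The arithmetic of this section for the basic three-pass scheme can be reproduced with a few lines. The sketch below only evaluates the weighted average stated above and the number of rounds needed for a target fraud probability of $10^{-6}$; it does not model the improved scheme of Appendix B. The function names and default arguments are ours.

```python
import math


def rounds_needed(fraud_per_round, target=1e-6):
    """Smallest number of rounds k with fraud_per_round**k <= target."""
    return math.ceil(math.log(target) / math.log(fraud_per_round))


def avg_bits_per_round(eta, n, m, q, hash_bits=160, seed_bits=160):
    """Weighted average for the basic scheme, each query having probability 1/3."""
    log_q = math.log2(q)
    first_two_passes = 3 * hash_bits + 2      # three commitments plus the query
    answer_q0 = 2 * eta * n * log_q           # two eta x n matrices
    answer_q12 = seed_bits + m * log_q        # seed for (S, T, X) plus beta_Q
    return first_two_passes + answer_q0 / 3 + 2 * answer_q12 / 3


print(rounds_needed(2 / 3), rounds_needed(1 / 2))
print(round(avg_bits_per_round(6, 6, 10, 65521)))   # parameter set A
```

For parameter set A this gives a little under 1080 bits per round before any of the improvements of Appendix B.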

⁸ Such modifications make the security depend on an additional assumption. It seems
to be a quite weak and plausible assumption. For example here $(S, T, X)$ should be
indistinguishable from random.

8.2 Comparison with Other Schemes

The following table compares different Zero-knowledge protocols based on
NP-complete problems, based on previous work of Pointcheval [28].

                        PKP      SD        Chen [4]   CLE         PPP           MinRank (A)
                        Shamir   Stern     Chen       Stern       Pointcheval   Author
matrix                  16×34    256×512   32×16      24×48       101×117       6×6
field                   F_251    F_2       F_65535    F_257       F_2           F_65521
passes                  5        3         5          3/5         3/5           3
impersonation
probability             1/2      2/3       1/2        2/3 / 1/2   3/4 / 2/3     2/3 / 1/2
rounds                  20       35        20         35/20       48/35         35/20
global impersonation    10^-6    10^-6     10^-6      10^-6       10^-6         10^-6
public key [bits]       272      256       256        80          149           735
secret key [bits]       128      512       512        80          117           160
best attack             2^…      2^…       2^53       2^73        2^…           2^106
bits sent/round         665      954       1553       940/824     896/1040      1075/694
global [Kbytes]         1.62     4.08      3.79       4.01/2.01   5.25/4.44     4.6/1.94

9 Conclusion and Perspectives 

We described a new MinRank authentication scheme. It is proven Zero- 
knowledge and relies on a linear algebra problem MinRank. This NP-hard prob- 
lem contains in a very natural way some famous problems such as Syndrome 
Decoding. Both these problems are believed hard on average and all the known 
algorithms are exponential. 

It is possible to use MinRank to prove in Zero-knowledge a knowledge of a 
solution for any problem expressed as a set of multivariate equations over a finite 
field (see 3.2). However, the encoding will not always be practical. 

Among known schemes based on NP-complete problems MinRank is one of 
the most efficient, though several schemes are not much worse. 

MinRank also allows sharing the public key among several users in such a
way that any small subgroup can identify itself or produce signatures.

Acknowledgments. I would like to thank Prof. Ernst M. Gabidulin, Prof.
Jacques Patarin, Prof. Claus P. Schnorr and Dr. Louis Goubin for helpful remarks.

A Probability Distribution of Ranks

Following [13] the probability that a random $\eta \times n$ matrix is of rank $r$ is

$$P(\eta, n, r) = \frac{(q^{\eta} - 1) \cdots (q^{\eta} - q^{r-1}) \cdot (q^{n} - 1) \cdots (q^{n} - q^{r-1})}{(q^{r} - 1) \cdots (q^{r} - q^{r-1}) \cdot q^{\eta n}}$$

If $r \le \min(n, \eta)$ it is non-zero, and when all the $n, \eta, r \to \infty$ we get the
following approximation:

$$P(\eta, n, r) \approx q^{(\eta+n)r - r^2 - \eta n}$$

The probability that a random $\eta \times n$ matrix is of rank $\le r$ is about:

$$\sum_{s=0}^{r} q^{(\eta+n)s - s^2 - \eta n} \approx q^{(\eta+n)r - r^2 - \eta n}$$

There are $\frac{q^m - 1}{q - 1}$ non-collinear combinations $\alpha$ of the $M_i$. The probability that
all of them give $\mathrm{Rank}(\sum \alpha_i M_i - M_0) > r$ is about
$\big(1 - q^{(\eta+n)r - r^2 - \eta n}\big)^{\frac{q^m - 1}{q - 1}}$, so that for $r < \min(n, \eta)$:

$$\Pr_{\alpha}[\mathrm{Rank} \le r](\eta, n, r) = 1 - \left(1 - q^{(\eta+n)r - r^2 - \eta n}\right)^{\frac{q^m - 1}{q - 1}}$$

We want to evaluate the value $m_{max}$ such that for $m > m_{max}$ we expect to
have solutions for a random MinRank, and such that for $m \approx m_{max}$ we expect
to have one solution on average. Therefore:

$$\frac{q^m - 1}{q - 1} \cdot O\left(q^{(\eta+n)r - r^2 - \eta n}\right) \approx 1 \quad \Longrightarrow \quad m_{max} \stackrel{def}{=} \eta n + r^2 - (\eta + n) r + 1$$
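
The approximation for the probability that a random instance has a solution is easy to evaluate numerically. The sketch below does so for parameter sets A and F of §4.3; it assumes $r < \min(n, \eta)$ and parameters small enough that $q^m$ still fits in a floating-point number (so it is not meant for the HFE instances). The function name is ours, and `log1p`/`expm1` are used only to avoid rounding $1 - x$ to 1 for tiny $x$.

```python
import math


def prob_has_solution(eta, n, m, r, q):
    """Approximate Pr_alpha[Rank <= r] = 1 - (1 - q^e)^((q^m - 1)/(q - 1)),
    with e = (eta + n) r - r^2 - eta n, assuming r < min(eta, n)."""
    exponent = (eta + n) * r - r * r - eta * n      # negative for r < min(eta, n)
    p_single = float(q) ** exponent                 # chance that one combination has rank <= r
    count = (q ** m - 1) // (q - 1)                 # number of non-collinear combinations
    return -math.expm1(count * math.log1p(-p_single))


print(prob_has_solution(6, 6, 10, 3, 65521))    # parameter set A
print(prob_has_solution(29, 29, 190, 15, 2))    # parameter set F
```

With $m = m_{max}$ (set A) the expected number of solutions is about one, so the probability is close to $1 - 1/e$, in line with the value 0.6 listed in the table of §4.3; for set F the result is close to $2^{-6}$.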



B Achieving Fraud Probability 1/2 

We present a technique to achieve the fraud probability 1/2 instead of 2/3. It
has the following interesting features:

- It requires an additional assumption (of the type one-wayness of a function).

- Should this assumption fail, the scheme is still at least as secure as before,
  only with a worse impersonation probability.

The principle of the "trick" is to replace some random choices by a deterministic
procedure so that they are still random but cannot be chosen. We add an
additional "verifiable" requirement on the generation of some values, and thus we
eliminate some fraud scenarios (but not others). Then we modify the probabilities
of different questions in order to balance the probabilities for the remaining
fraud scenarios.

We consider any Zero-knowledge protocol in which a Prover picks up 2 values
$\beta_1$ and $\beta_2$ such that $\beta_2 - \beta_1 = \alpha$ is a given (usually secret) value. Usually we
will generate $\beta_1$ at random and compute $\beta_2$, which enables fraud scenarios in
which the adversary may choose a value for one out of $\beta_1, \beta_2$. We want to avoid
this. Let $F$ be a function with the following properties:

(1) It is very hard to compute an inverse $F^{-1}(y)$ for a given random $y$.

(2) It is very easy to compute two solutions $x$ and $x'$ such that $F(x') - F(x)$ is
a given value $\Delta y$ and $x' = x + \Delta x$ with a given constant $\Delta x$.

Example 1: $F : x \mapsto x^2 \bmod N$, $N$ being an RSA modulus. The inversion
problem (1) is as hard as factoring.

Example 2: $F$ is a set of random quadratic equations over a finite field $GF(q)$.
The inversion problem (1) is called MQ; it is NP-hard and very
difficult in practice [26,33].

In both examples, (2) is a linear problem easily solved.

We note that each of the above examples is applied with an operation '+' that
belongs to a different group. Only the second example can be used for MinRank,
as our '+' will be the component-by-component addition in the finite field.

B.l Application to MinRank Scheme 

Let $F$ be a public fixed random set of quadratic equations over $GF(q)$.

In the modified MinRank scheme, the Prover picks up two 160-bit seeds $Z$ and
$STX$. Let $\Delta y = \mathrm{Expand}(Z)$ and $(S, T, X) = \mathrm{Expand}(STX)$ be the output of a
pseudo-random generator. He solves

$$(S) \quad \begin{cases} F\big(T(\sum \beta_{2i} M_i)S - TM_0S + X\big) - F\big(T(\sum \beta_{1i} M_i)S + X\big) = \mathrm{Expand}(Z) \\ \beta_2 - \beta_1 = \alpha \end{cases}$$

The first equation becomes linear in $\beta_1$ after substitution of $\beta_2 = \beta_1 + \alpha$. He gets
$m$ linear equations with $m$ variables $\beta_{1i}$. If there is no solution $(\beta_1, \beta_2)$ found,
he tries again with a new $Z$.



Verification that the Prover Follows the Scenario: If $Q = 0$, the Prover
will send an additional value $Z$. The Verifier will check that
$F(TN_2S + X - TM_0S) - F(TN_1S + X) = \mathrm{Expand}(Z)$. In the previous version of the
MinRank scheme the possible fraud scenarios were:

01 Try to be able to answer $Q = 0$ and 1.
   It is easy to produce two matrices, seemingly $T(\sum \beta_{1i} M_i)S + X$ and
   $(T(\sum \beta_{2i} M_i)S - TM_0S + X)$, such that only one of them is really
   constructed in such a form, and the other is adjusted to get a difference of rank $r$.

02 Try to be able to answer $Q = 0$ and 2 in the same way.

12 Try to be able to answer $Q = 1$ and 2: We pick up any $STX$, $\beta_1$, $\beta_2$ and
   produce a genuine $T(\sum \beta_{Qi} M_i)S + X\,[-TM_0S]$.

0 Try to be able to answer $Q = 0$ only. For this we just give any matrices that
   have a difference with rank $r$.

1 Try to be able to answer $Q = 1$ only. For this we produce $T(\sum \beta_{1i} M_i)S + X$
   in the required form.

2 Try to be able to answer $Q = 2$ only. As above.

The new version excludes the scenarios (01) and (02). Let us see why on the
example of scenario (01). We assume that a false Prover wants to answer $Q = 0$
and 1. He may try the following possibilities:

a. Since $S$, $T$ and $X$ are always obtained as $\mathrm{Expand}(STX)$, if we cheat and
   have not selected them in this way, we are only able to answer $Q = 0$.

b. He may try to pick up $\beta_1$. Since $F$ is one-way (the NP-hard problem MQ),
   he will be unable to produce a matrix $R$ such that
   $F(R) - F(T(\sum \beta_{1i} M_i)S + X) = \mathrm{Expand}(Z)$.

c. Another way is to try to find $R$ of rank $r$ and write the $n\eta$ equations with $m$
   variables $(\sum \beta_{2i} M_i - M_0) - (\sum \beta_{1i} M_i) = R$. However, finding a solution is
   hard because $\alpha = \beta_2 - \beta_1$ would allow him to solve an instance of MinRank.

Still an adversary has the capacity to answer all possible questions separately:
fraud scenarios (0), (1) and (2).

Resulting Changes in the Protocol. Now we may modify the probabilities.
The question $Q = 0$ is asked with probability 1/2 and $Q = 1, 2$ with probability
1/4 each. The following table shows the probabilities of success for all fraud
scenarios.



Fraud scenario          0     1     2     01    02    12    012
Pr[Success] before      1/3   1/3   1/3   2/3   2/3   2/3   0
            now         1/2   1/4   1/4   0     0     1/2   0

A false Prover is detected with probability 1/2. Now only 20 instead of 35
rounds are needed to achieve the security of $10^{-6}$.
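
The balancing argument can be checked mechanically: the success probability of a fraud scenario is the total probability of the queries it can answer. The sketch below recomputes the best cheating probability before and after the change; the variable names are ours.

```python
from fractions import Fraction


def success_probability(scenario, query_dist):
    """A cheater prepared for the queries in `scenario` succeeds when one of them is asked."""
    return sum(query_dist[q] for q in scenario)


uniform = {0: Fraction(1, 3), 1: Fraction(1, 3), 2: Fraction(1, 3)}   # basic scheme
biased = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 4)}    # modified scheme

basic_scenarios = [(0,), (1,), (2,), (0, 1), (0, 2), (1, 2)]
v2_scenarios = [(0,), (1,), (2,), (1, 2)]        # (01) and (02) are excluded

best_basic = max(success_probability(s, uniform) for s in basic_scenarios)
best_v2 = max(success_probability(s, biased) for s in v2_scenarios)
# If the one-wayness assumption failed, (01) and (02) would be possible again:
best_v2_broken = max(success_probability(s, biased) for s in basic_scenarios)

print(best_basic, best_v2, best_v2_broken)
```

The last value illustrates the fallback case discussed in the note that follows: with the biased query distribution, a revived scenario (01) or (02) succeeds with probability $1/2 + 1/4$.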

Note: We obtained a more efficient authentication scheme with an added
computational assumption based on the NP-hard problem MQ. This problem is
believed to be very hard [33], but if it were not, the scenarios (01) and (02) would be
possible again and the fraud probability would be 3/4. The MinRank scheme would
remain secure, but with a worse fraud probability, or equivalently, it would require
more iterations.

Further Improvements. First we remark that if Q = 0, it is not necessary
at all to transmit the two values $TN_2S - TM_0S + X$ and $TN_1S + X$. In fact
it is enough to transmit their difference TMS and Z, which is already among the
values that are transmitted. The values of $TN_2S - TM_0S + X$ and $TN_1S + X$ can
then be recovered by the Verifier, who has to solve a system similar to (B.1.(S)).
We save the transfer of one $\eta \times n$ matrix.

Another improvement is to use only one seed STXZ with:

(S, T, X, Z) = Expand(STXZ)

B.2 The Modified Version MinRank-v2

Now we integrate all improvements in order to have a general view. The Prover
chooses a random 160-bit seed STXZ. Let

(S, T, X, Z) = Expand(STXZ)

Ay = Expand (Z) 

Now the Prover solves:

$$\begin{cases} F\big(T(\sum_i \beta_{2i} M_i)S - TM_0S + X\big) - F\big(T(\sum_i \beta_{1i} M_i)S + X\big) = \mathrm{Expand}(Z) \\ \beta_2 - \beta_1 = \alpha \end{cases}$$

If there is no solution $(\beta_1, \beta_2)$, we try again a small number of times, with a
different seed STXZ. Then in each round of authentication:

1. The Prover sends to the Verifier:

$H(STXZ),\ H(TN_1S + X),\ H(TN_2S + X - TM_0S)$

2. The Verifier chooses a query Q, such that Q = 0 with probability 1/2, and
$Q \in \{1, 2\}$ with probability 1/4 each. He sends Q to the Prover:

Prover <-- Verifier: $Q \in \{0, 1, 2\}$

3. If Q = 0, the Prover gives the following values:

Prover --> Verifier: TMS, Z

Verification Q = 0: The Verifier computes $(TN_1S + X)$ and
$(TN_2S + X - TM_0S)$, see B.1. Then he accepts if $H(TN_1S + X)$ and
$H(TN_2S + X - TM_0S)$ are correct, and if Rank(TMS) = r.

3’. In the case Q = 1, 2, the Prover reveals:

STXZ, $\beta_Q$

Verification Q = 1, 2: The Verifier checks if S and T are invertible and if
H(STXZ) is correct. Then he computes

$TN_QS = \sum_i \beta_{Qi}\, TM_iS$

and verifies the correctness of $H(TN_1S + X)$ or $H(TN_2S + X - TM_0S)$.
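The Q = 0 check relies on a simple linear-algebra identity: since $\beta_2 - \beta_1 = \alpha$, the two committed matrices differ by exactly TMS, whose rank is r because S and T are invertible. The following sketch checks this identity numerically. It is an illustration only: the small prime field, the toy dimensions and all helper names are assumptions, and the one-way function F, the hashes and Expand are deliberately left out.

```python
import random

P = 251  # small prime standing in for the field size q (illustrative assumption)

def mat_mul(A, B):
    """Matrix product over GF(P)."""
    return [[sum(a * b for a, b in zip(row, col)) % P for col in zip(*B)] for row in A]

def mat_add(A, B):
    return [[(a + b) % P for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_sub(A, B):
    return [[(a - b) % P for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def lin_comb(coeffs, mats):
    """Sum of coeffs[i] * mats[i] over GF(P)."""
    rows, cols = len(mats[0]), len(mats[0][0])
    return [[sum(c * M[i][j] for c, M in zip(coeffs, mats)) % P
             for j in range(cols)] for i in range(rows)]

def rank(A):
    """Rank over GF(P) by Gaussian elimination."""
    A = [row[:] for row in A]
    rk = 0
    for c in range(len(A[0])):
        pivot = next((i for i in range(rk, len(A)) if A[i][c]), None)
        if pivot is None:
            continue
        A[rk], A[pivot] = A[pivot], A[rk]
        inv = pow(A[rk][c], -1, P)  # modular inverse (Python 3.8+)
        A[rk] = [x * inv % P for x in A[rk]]
        for i in range(len(A)):
            if i != rk and A[i][c]:
                f = A[i][c]
                A[i] = [(x - f * y) % P for x, y in zip(A[i], A[rk])]
        rk += 1
        if rk == len(A):
            break
    return rk

def rand_mat(rng, rows, cols):
    return [[rng.randrange(P) for _ in range(cols)] for _ in range(rows)]

def rand_with_rank(rng, rows, cols, r):
    """Rejection-sample a rows x cols matrix of rank exactly r."""
    while True:
        M = mat_mul(rand_mat(rng, rows, r), rand_mat(rng, r, cols))
        if rank(M) == r:
            return M

def demo(seed=1, eta=4, n=5, m=6, r=2):
    rng = random.Random(seed)
    Ms = [rand_mat(rng, eta, n) for _ in range(m)]        # public M_1..M_m
    alpha = [rng.randrange(P) for _ in range(m)]          # secret alpha
    M = rand_with_rank(rng, eta, n, r)                    # secret rank-r matrix
    M0 = mat_sub(lin_comb(alpha, Ms), M)                  # sum(alpha_i M_i) - M_0 = M
    beta1 = [rng.randrange(P) for _ in range(m)]
    beta2 = [(a + b) % P for a, b in zip(alpha, beta1)]   # beta_2 - beta_1 = alpha
    T = rand_with_rank(rng, eta, eta, eta)                # invertible eta x eta
    S = rand_with_rank(rng, n, n, n)                      # invertible n x n
    X = rand_mat(rng, eta, n)
    tns = lambda N: mat_mul(mat_mul(T, N), S)
    first = mat_add(tns(lin_comb(beta1, Ms)), X)                     # TN_1S + X
    second = mat_sub(mat_add(tns(lin_comb(beta2, Ms)), X), tns(M0))  # TN_2S + X - TM_0S
    return mat_sub(second, first), tns(M), r
```

Calling `demo()` returns the difference of the two committed values, the matrix TMS, and r; the first two coincide and both have rank r.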

B.3 Improvements in the Communications 

As in 8.1 we compute the communication complexity of the new version. By
inspection we see that it becomes:

$$\left(3 \cdot 160 + 2 + \frac{\eta n + m}{2} \log_2 q\right) \cdot \#\mathrm{rounds}$$

Remark: The value of 160 bits for the length of seeds and commitments is
appropriate for the security level of $2^{80}$ and should be increased otherwise. For
example, for a security level of $2^{SF}$ we should use 2SF bits. So we get

$$\left(6\,SF + 2 + \frac{\eta n + m}{2} \log_2 q\right) \cdot \#\mathrm{rounds}$$

Chaining random seeds. It is also possible to save on the size of random
seeds used in the scheme and use one single seed of 2SF bits for the whole
scheme. Each time we compute a seed $A_i$ as follows:

$A_i = H(A_0 \| i \| b_1, \ldots, b_7)$

with a hash function H of appropriate length and with 7 random bits $b_j$, as the seed
$STXZ = A_i$ will only work in Sect. B.2 with a probability smaller than 1. Thus
we may try again with different bits $b_j$ in order to have a working seed. With $2^7 = 128$ tries we
have a negligible probability of never finding an appropriate seed. The main seed
$A_0$ is only given at the end, after all rounds of authentication, and only then are all
the verifications carried out. Now, with the exception of $A_0$, each round requires
only $4\,SF + 7 + 2 + \frac{\eta n + m}{2} \log_2 q$ bits. Thus we get a communication complexity
of

$$2\,SF + \left(4\,SF + 9 + \frac{\eta n + m}{2} \log_2 q\right) \cdot \#\mathrm{rounds}.$$
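The seed-chaining idea can be sketched as follows. This is a minimal illustration under stated assumptions: SHA-256 truncated to 160 bits stands in for the unspecified hash function, the byte encoding of the inputs is our own choice, the 7 bits are enumerated rather than drawn at random, and `seed_works` is a hypothetical predicate standing in for "the system of Sect. B.2 has a solution for this seed".

```python
import hashlib

SEED_BYTES = 20  # 2*SF bits with SF = 80, i.e. 160 bits

def derive_seed(main_seed: bytes, round_index: int, bits: int) -> bytes:
    """Round seed A_i derived from the main seed A_0, the round index i and 7 extra bits."""
    if not 0 <= bits < 128:
        raise ValueError("bits must fit in 7 bits")
    data = main_seed + round_index.to_bytes(4, "big") + bytes([bits])
    return hashlib.sha256(data).digest()[:SEED_BYTES]

def find_working_seed(main_seed: bytes, round_index: int, seed_works):
    """Try the 2^7 = 128 candidate values of the extra bits until one seed is usable."""
    for bits in range(128):
        seed = derive_seed(main_seed, round_index, bits)
        if seed_works(seed):
            return bits, seed
    raise RuntimeError("no working seed among the 128 candidates")

# Toy predicate (hypothetical): accept roughly half of all seeds.
toy_predicate = lambda seed: seed[0] % 2 == 0
chosen_bits, round_seed = find_working_seed(bytes(SEED_BYTES), 1, toy_predicate)
```

Only the 7 chosen bits need to accompany each round; once the main seed is revealed at the end, the Verifier can recompute every round seed with `derive_seed`.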

C Group Authentication/Signatures with MinRank 

It is easy to produce almost totally random instances of MinRank with several
users, each of whom has one solution to MinRank and no information about
other solutions. We pick 1 + m [pseudo-]random matrices $M_0; M_1, \ldots, M_m$. Each
user i has the right to pick a matrix $U_i$ such that $U_i - M_0$, plus some randomly
chosen linear combination of $M_1, \ldots, M_m$, has a small rank. This can be
done for an unlimited (in practice) number of users. Then the set of matrices
$M_0; M_1, \ldots, M_m$, together with $\{U_i \mid i \in G\}$, is the public key for any small (*)
subgroup G. Now any member of the group G can use the MinRank authentication
scheme to anonymously prove his membership.

(*) Here the total number of matrices m can be very big: attacks such as the “big m
attack” described in §2 or in [11,10] will only apply to a smaller m', the maximum
cardinality of a subgroup used.

C.1 Ring Signatures with MinRank

A well-known method (see [5]) that transforms a zero-knowledge protocol into a
signature scheme will also apply to MinRank. This in turn can be combined with
the above multi-user setting. We obtain an anonymous group signature scheme
known as a ring signature scheme [29], with the following properties:

- Each group member signs with his own private key (no shared secrets).

- He may sign on behalf of any subgroup of users that contains himself.

- There is no central authority.

- The user within the group that signs is anonymous (inside the group).

- Security is based on the NP-hard problem MinRank.

- At any moment we may introduce a new user and remove a user.

- Selective repudiation of signatures: introducing a new user U' and invalidating
  his public key can be used as a means to repudiate all signatures made
  with this user included in the subgroup. The repudiation is controlled by the
  person who knows the secret key of U' and publishes it.




Responsive Round Complexity and Concurrent Zero-Knowledge



Tzafrir Cohen^1, Joe Kilian^2, and Erez Petrank^1

^1 Dept. of Computer Science, Technion - Israel Institute of Technology, Haifa 32000,
Israel, {tzafrir|erez}@cs.technion.ac.il
^2 Yianilos Labs, joe@pnylab.com



Abstract. The number of communication rounds is a classic complexity 
measure for protocols; reducing round complexity is a major goal in pro- 
tocol design. However, when the communication time is inconstant, and 
in particular, when one of the parties intentionally delays its messages, 
the round complexity measure may become meaningless. For example, 
if one of the rounds takes longer than the rest of the protocol, then it 
does not matter if the round complexity is bounded by a constant or 
by a polynomial. In this paper, we propose a complexity measure called 
responsive round complexity. Loosely speaking, a protocol has responsive
round complexity m with respect to Party A, if it makes the following 
guarantee. If A’s longest delay in responding to a message in a run of the 
protocol is t, then, in that run, the overall communication time is at most 
m · t. The logic behind this definition is that if a party responds quickly
to a message, whether it has a good connection or it just chooses not 
to delay its messages, then this party deserves to get an overall quicker 
running time. Responsive round complexity is particularly interesting in 
a setting where a party may gain something by delaying its messages. In 
this case, the delaying party does not deserve the same response time as 
another party that behaves nicely. 

We demonstrate the significance of responsive round complexity by 
presenting a new protocol for concurrent zero-knowledge. The new 
protocol is a black-box concurrent zero knowledge proof for all languages 
in NP with round complexity $\tilde{O}(\log^2 n)$ but responsive round complexity
$\tilde{O}(\log n)$. While the round complexity of the new protocol is similar to
what is known from previous works, its responsive round complexity is a
significant improvement: all known concurrent zero-knowledge protocols
require $\tilde{O}(\log^2 n)$ rounds. Furthermore, in light of the known lower
bounds, the responsive round complexity of this protocol is basically 
optimal. 

Keywords: Zero-knowledge, concurrent zero-knowledge, cryptographic 
protocols. 



1 Introduction 

In this work, we study a new measure related to the round complexity of protocols.
We propose a notion of responsive round complexity that properly relates
the running time of the protocol with the response time of each of the parties.
Finally, we show how to improve state-of-the-art concurrent zero-knowledge pro- 
tocols with respect to their responsive round complexity, and obtain an (almost) 
optimal protocol with respect to its responsive round complexity. 



1.1 Round Complexity 

The number of rounds in a run of a protocol can be a major time-consuming 
component. Therefore, round complexity is one of the important complexity 
measures of protocols. However, a protocol’s round complexity is not always 
directly proportional to its time complexity. The reason is that communication 
rounds do not always have the same length. Thus, for example, if the length of 
one of the rounds exceeds the accumulative length of all the other rounds, the 
round complexity does not tell us anything about the time complexity. 

The difference in the length of communication rounds may be a result of two 
different reasons. One is that the network is unstable and communication times 
vary during the run of the protocol. The second possible reason is that one of 
the parties may delay its messages. It is sometimes useful for a party to delay 
its answer in the protocol until something happens. For example, it may delay 
its answer until it obtains information from another source, or it may try to foil 
timing assumptions made by other parties in the protocol. 

We propose a new complexity measure called responsive round complexity. 
Our intention is to relate the overall running time of a party to its response time. 
By this measure, each party gets a guarantee on the overall communication time, 
which relates to the longest delay it imposes on the run of the protocol. A party 
that always responds quickly gets a good guarantee on the overall communication 
time, and a party that sometimes responds slowly gets a poor guarantee on the 
overall communication time. 

In this extended abstract we concentrate on the two-party case. An extension 
of the definition to multi-party protocols is straightforward. 

Definition 1.1. Response time: We say that the response time of a party A
in round i of a specific run a of a protocol $\Pi$ is t, if $\Pi$ in run a tells A to send
a message in round i, and t is the length of the time interval starting from the
time B sent its message in round i - 1 and ending at the time B received A’s
response of round i. (If A is not supposed to send a message in round i, then its
response time is 0 for round i.) The response time of A in run a of protocol $\Pi$
is the maximum over all rounds i of A’s response time in round i.



Definition 1.2. Responsive round complexity: We say that a protocol $\Pi$ has
responsive round complexity m with respect to Party A, if for any possible run
a of Protocol $\Pi$, the overall communication time does not exceed $t \cdot m$, where t
is A’s response time in run a of Protocol $\Pi$.

Note that if all rounds are equally long in all runs of the protocol, then the
responsive round complexity measure with respect to each of the two parties
equals the (standard) round complexity measure^1.
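The two definitions can be made concrete with a small sketch. It assumes a run is described by the timestamps, as seen by B, of each round in which A must respond; the function names and the example runs are illustrative and not taken from the paper.

```python
def response_time(rounds):
    """A's response time in a run: the longest interval between B sending a
    message and B receiving A's reply. `rounds` is a list of (sent, received)."""
    return max((received - sent for sent, received in rounds), default=0)

def within_responsive_bound(rounds, total_time, m):
    """Check the guarantee of responsive round complexity m for one run:
    total communication time is at most m times A's response time."""
    return total_time <= m * response_time(rounds)

# Five equally long rounds of 2 time units each: the bound holds exactly when
# m is at least the number of rounds, matching the standard round complexity.
equal_run = [(0, 2), (2, 4), (4, 6), (6, 8), (8, 10)]

# Three rounds, the last one delayed by A: a run of total length 12 is already
# covered by m = 2, because A's own longest delay is 10.
delayed_run = [(0, 1), (1, 2), (2, 12)]
```

The second run shows the intent of the measure: a party that delays one message by a long time is owed only a proportionally long overall running time.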

Our primary interest in this notion is for cases when a party actually delays its 
messages to gain something; our goal is to develop protocols in which purposeful
delays merely punish the delayer. However, we note that the guarantee of our
protocol also holds for networks with unstable or heterogeneous communication 
links. Parties will be (unfairly) punished for network delays beyond their control, 
but this punishment will be roughly proportional to the inherent delays. Thus, 
someone with a slow connection will obtain slow service, but will not be starved, 
as would be the case if a protocol simply timed-out on slow participants. 

We demonstrate the usefulness of the new notion by using it for analyzing 
concurrent zero-knowledge protocols and constructing a new protocol with an 
(almost) optimal responsive round complexity. 



1.2 Concurrent Zero Knowledge 

Zero-knowledge interactive proofs as presented by Goldwasser, Micali, and Rack- 
off [16] are proofs that yield no knowledge but the validity of the proven assertion. 
These proof systems have proven important tools for a variety of cryptographic 
applications. However, the original definition of zero-knowledge considers secu- 
rity only in a restricted scenario in which the prover and the verifier execute the 
proof disconnected from the rest of the computing environment. 

In recent years, several papers have studied the effect of a modern computing
environment on the security of zero-knowledge. In particular, many computers 
today are connected through networks in which connections are maintained in 
parallel asynchronous sessions. It would be common to find several connections 
(such as FTP, Telnet, an internet browser, etc.) running together on a single 
workstation. Can zero-knowledge protocols be trusted in such an environment? 

Zero-knowledge in a concurrent environment was first explored by Feige [12], 
and by Dwork, Naor, and Sahai [10]. Dwork, Naor and Sahai used the term concurrent
zero-knowledge for zero-knowledge protocols that are robust to asynchronous
composition. They observed that for several known zero-knowledge
proofs, a straightforward adaptation of their original simulation to the
asynchronous environment may cause the simulator to run in exponential time.
Thus, it seems that the zero-knowledge property does not necessarily carry over 
to the asynchronous setting. 

Kilian, Petrank, and Rackoff [19] gave the first lower bound for concurrent
zero-knowledge, showing that any language that has a 4-round black-box concurrent
zero-knowledge interactive proof or argument is in BPP. Thus, a large
class of known zero-knowledge interactive proofs and arguments for non-trivial
languages do not remain zero-knowledge in an asynchronous environment. Rosen
[25] has improved this lower bound from 4 rounds to 7. Canetti, Kilian, Petrank
and Rosen [5] have substantially improved the lower bound to $\tilde{\Omega}(\log k)$.^2
The parameter k is the security parameter. A polynomial in k bounds the length
of the inputs, the number of proofs that may start concurrently, and the time
complexity that the parties spend in the protocol.

^1 Here, we adopt the measure by which a round consists of two messages: one from
Party A to Party B and the other is the response of B to A.

On the other hand, Richardson and Kilian [24] exhibited a concurrent zero- 
knowledge proof for any language in NP. Their protocol requires polynomially 
many rounds in k. Kilian and Petrank [18] substantially narrow the gap between 
the upper bound and the lower bounds. Using a different simulator, they provide 
a tighter security analysis for the Richardson-Kilian protocol, and show that it
remains concurrent zero-knowledge when run with only $\omega(\log^2 k)$ rounds.

How do these results translate to responsive round complexity? Zero-know- 
ledge is about providing security to the prover. Thus, we expect the prover to 
follow the protocol and not delay its answers. The verifier is the bad guy, who 
may choose to deviate from what the protocol dictates in order to get knowledge 
from the prover. Thus, the verifier may delay its answers, and we would like to 
investigate how protocols behave in this case. It seems fair to provide quicker 
service (overall communication time) to verifiers that respond quickly and do 
not delay their answers. Verifiers that do delay their answers may get an overall 
slower run of the protocol. Responsive round complexity guarantees that the 
overall communication time is proportional to the longest delay of the verifier. 

Looking at the best known upper bound protocol in [18], it is easy to see 
that the responsive round complexity with respect to the verifier is equivalent 
to its round complexity in stable networks. The verifier may simply keep its 
response time steady, and then the two measures coincide. This is the best known
protocol with respect to responsive round complexity, and it has responsive round
complexity of any function m satisfying $m = \omega(\log^2 k)$.

If we look at the best known lower bound in [5] , it provides a specific schedule 
such that if the protocol does not have enough rounds, no black box simulator 
can simulate it in this schedule. In the demonstrated schedule each verifier has 
its own response time, but each of the verifiers does not change its response 
time during the proof. Thus, the lower bound holds also for responsive round
complexity, and we cannot do better than $\tilde{\Omega}(\log k)$.

In this paper we present a new concurrent zero-knowledge proof for all
languages in NP that has responsive round complexity m for any function
$m = \omega(\log k)$. Namely, the responsive round complexity of this protocol can
be set to any function asymptotically larger than log k. Thus, we get a protocol
whose responsive round complexity is almost optimal (up to a factor of at
most $O(\log^2 \log k)$). Thus, any verifier that does not delay its messages (or even
just does not change the delay from round to round) is guaranteed a round
complexity of $\tilde{O}(\log k)$. Verifiers that do delay their messages get a protocol whose
running time is at most $\tilde{O}(\log k)$ times the longest delay they choose to use.



^2 The “twiddle” notation neglects multiplicative factors that are polylogarithmic in
the main term.

In a recent breakthrough work, Barak [1] gives a concurrent zero-knowledge
protocol for NP which is not black-box and requires only a constant number of rounds.
A slight drawback of this protocol is that the maximum number k of concurrent
sessions tolerated must be fixed in advance, and the communication
required by this protocol is proportional to the chosen polynomial. Our protocol,
like previous black-box protocols, is robust against any polynomial number of
concurrent sessions, and its overall communication is independent of the number
of sessions.



1.3 Techniques 

One set of previous protocols [24,18] ignore the timing of the messages and con- 
sider only their order; black-box simulatable protocols exist in the general model, 
though with high round complexity. Another approach is taken by the protocols 
of [10,11]. In this approach, strong restrictions are enforced (or assumed) on
the ratio between the fastest and slowest response times, simplifying the task
of producing a simulation, and allowing for constant-round protocols. One in- 
terpretation (and implementation) of this restriction is that verifiers with slow 
response times are treated as malicious; their responses are rejected. 

Our approach is intermediate between these two approaches. As with the
latter approach, we do take response times into consideration, but as with the
first approach we place no restriction on these delays. Instead, we monitor the
delays and “punish” verifiers with long delays, though in a proportionate fashion.
Each verifier has an associated response time that is doubled when the verifier
does not respond in time. Thus, there are O(log k) sets of verifiers, each set
containing verifiers responding at around the same time. The prover may delay
its answers to each verifier to match its delays with those of the verifier. For each
set, we use techniques similar to those in [10,11] to simulate the conversations
with verifiers in this set. We then show that simulating the O(log k) sets together
is still doable in polynomial time.
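The doubling rule explains why only logarithmically many sets arise: a delay of d time units is covered after about $\log_2 d$ doublings, and all delays are bounded by the (polynomial) running time. The sketch below only illustrates this counting argument; the function names, the base time unit and the sample delays are assumptions, and the actual protocol is the one presented later in the paper.

```python
def delay_class(delay, base=1):
    """Number of doublings of the allowed response time (starting at `base`)
    needed before it covers `delay`."""
    allowed, doublings = base, 0
    while allowed < delay:
        allowed *= 2
        doublings += 1
    return doublings

def group_by_class(delays, base=1):
    """Group verifiers by the number of doublings their delay forces."""
    groups = {}
    for verifier, delay in delays.items():
        groups.setdefault(delay_class(delay, base), []).append(verifier)
    return groups

# Hypothetical observed delays, in minimal time units.
observed = {"V1": 1, "V2": 3, "V3": 4, "V4": 900}
groups = group_by_class(observed)
```

Here V2 and V3 land in the same set (two doublings), while V4 needs ten doublings; with delays bounded by a polynomial in k, the number of distinct sets is O(log k).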



1.4 Contributions 

The first contribution of this work is in proposing the notion of responsive round 
complexity. We feel that this notion may be useful in settings when one of the 
parties may gain something from delaying its responses. A guarantee on the 
responsive round complexity provides a guarantee on the time complexity such 
that each party “gets what it deserves”.

Our second contribution is in providing a concurrent black-box zero-knowledge
protocol with almost optimal responsive round complexity. Our design
uses the protocol of [24,18] as a subroutine; its main technical contribution
is a method for restarting this subroutine so as to obtain a better protocol in
practice.

1.5 Related Work

Our notion of responsive round complexity is of course related to the vast
literature on distributed algorithms, and continues the program of studying
zero-knowledge in distributed settings.^3 We do point out the difference between our
notion and the most commonly used distributed model. In the standard
distributed model, an adversary can speed up responses in worst case fashion; we
require that all parties give a “correct” output by the end of the protocol. In
our model, we impose an additional requirement on when individual parties finish
(give a final output); parties whose responses have been sped up may have to
finish long before the end of the protocol as a whole. (Here, the “protocol” is
the collective set of interactive proofs.)

Several recent works have overcome the difficulty of the asynchronous setting 
by putting limits on the asynchronicity of the system (timing assumptions) [10,
11,6,9] or by making some set-up assumptions on the environment (such as a 
public key infrastructure) [7,4]. 

1.6 Terminology 

Some words on the terminology we are using. By zero-knowledge we mean 
computational zero-knowledge, i.e., the distribution output by the simulation 
is polynomial-time indistinguishable from the distribution of the views of the 
verifier in the original interaction. Our proof is black-box zero-knowledge. The 
proof will be perfectly sound, i.e., we will construct an interactive proof, yet it 
will be possible to run the prover in polynomial time given a witness to the NP 
assertion that the prover is making. 

1.7 Guide to the Paper 

In Section 2 we go over the preliminaries. We state our main result in Sect. 3. 
We provide an overview on the protocol and proof in Sect. 4. The protocol itself 
is presented in Sect. 5, the simulator to the protocol is presented in Sect. 6, and 
the analysis of the simulator is given in Sect. 7. 

2 Preliminaries 

2.1 Zero-Knowledge Proofs 

Let us recall the concept of interactive proofs, as presented by [16]. For formal 
definitions and motivating discussions the reader is referred to [16]. 

Definition 2.1. A protocol between a (computationally unbounded) prover P
and a (probabilistic polynomial-time) verifier V constitutes an interactive proof
for a language L if there exists a negligible function $\epsilon$ such that

- Completeness: If $x \in L$ then $\Pr[(P, V)(x) \text{ accepts}] > 1 - \epsilon(|x|)$.

- Soundness: If $x \notin L$ then for any prover $P^*$,
  $\Pr[(P^*, V)(x) \text{ accepts}] < \epsilon(|x|)$.

^3 Indeed, we would not be surprised if quite similar definitions have been proposed in
this literature.

Brassard, Chaum, and Crepeau [2] introduced a modification of interactive 
proofs, called arguments, in which the prover is also polynomial time bounded. 
Thus, the soundness property is modified to be guaranteed only for probabilistic 
polynomial time provers P*. 

Let (P, V)(x) denote the random variable that represents V’s view of the
interaction with P on common input x. The view contains the verifier’s random
tape as well as the sequence of messages exchanged between the parties. 

We briefly recall the definition of black-box zero-knowledge [16,23,15,17]. The 
reader is referred to [17] for more details and motivation. 

Definition 2.2. A protocol (P, V) is computational zero-knowledge (resp., statistical
zero-knowledge) over a language L, if there exists an oracle polynomial
time machine S (simulator) such that for any polynomial time verifier $V^*$ and
for every $x \in L$, the distribution of the random variable $S^{V^*}(x)$ is polynomially
indistinguishable from the distribution of the random variable $(P, V^*)(x)$ (resp.,
the statistical difference between $S^{V^*}(x)$ and $(P, V^*)(x)$ is a negligible function in
|x|).

In this paper, we concentrate on black-box computational zero-knowledge, and 
use zero-knowledge as shorthand for black-box computational zero-knowledge. 

2.2 Bit Commitments 

We include a short and informal presentation of commitment schemes. For more 
details and motivation, see [22]. A commitment scheme involves two parties: 
The sender and the receiver. These two parties are involved in a protocol which 
contains two phases. In the first phase the sender commits to a bit, and in the 
second phase it reveals it. A useful intuition to keep in mind is the “envelope 
implementation” of bit commitment. In this implementation, the sender writes 
a bit on a piece of paper, puts it in an envelope and gives the envelope to 
the receiver. In a second (later) phase, the reveal phase, the receiver opens the
envelope to discover the bit that was committed to. In the actual digital protocol,
we cannot use envelopes, but the goal of the cryptographic machinery used is
to simulate this process.

More formally, a commitment scheme consists of two phases. First comes 
the commit phase and then we have the reveal phase. We make two security 
requirements which (loosely speaking) are: 

Secrecy: At the end of the commit phase, the receiver has no knowledge about 
the value committed upon. 

Binding property: It is infeasible for the sender to pass the commit phase
successfully and still have two different values which it may reveal successfully
in the reveal phase.

Various implementations of commitment schemes are known; each has its
advantages in terms of security (i.e., binding for the receiver and secrecy for the
sender), the assumed power of the two parties, etc.
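To make the two phases and the two requirements concrete, here is a toy hash-based commitment. It is only an illustration of the commit/reveal interface: it is not one of the schemes of [2], [8], [20] or [22], and both its secrecy and its binding rest on heuristic assumptions about SHA-256 rather than on the statistical or information-theoretic guarantees discussed in the text.

```python
import hashlib
import secrets

def commit(bit: int):
    """Commit phase: return (commitment sent to the receiver, opening kept by the sender)."""
    if bit not in (0, 1):
        raise ValueError("can only commit to a single bit")
    nonce = secrets.token_bytes(32)  # fresh randomness hides the bit
    commitment = hashlib.sha256(nonce + bytes([bit])).digest()
    return commitment, (nonce, bit)

def verify_opening(commitment: bytes, opening) -> bool:
    """Reveal phase: the receiver recomputes the hash from the revealed nonce and bit."""
    nonce, bit = opening
    return bit in (0, 1) and hashlib.sha256(nonce + bytes([bit])).digest() == commitment

commitment, opening = commit(1)
flipped_opening = (opening[0], 1 - opening[1])
```

The honest opening verifies, while the same nonce with the flipped bit does not; this is the binding property in miniature, and the random nonce plays the role of the sealed envelope.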

Two-round commitment schemes with perfect secrecy can be constructed
from any collection of claw-free permutations; see [22]. It is shown in [2] how
to commit to bits with statistical security, based on the intractability of certain
number-theoretic problems. Damgård, Pedersen and Pfitzmann [8] give a
protocol for efficiently committing to and revealing strings of bits with statistical
security, relying only on the existence of collision-intractable hash functions.
This scheme is quite practical and we adopt it for the verifiers in our protocol.
For the prover, we use a commitment scheme whose binding is information-theoretic
and whose secrecy is computational. Such schemes can be constructed from any
one-way function; see [20]. For simplicity, we simply speak of committing to and
revealing bits when referring to the protocols of [8] for the verifier and [20] for
the prover. We will need to use the properties of the commitment schemes in
the concurrent setting.

Theorem 2.3. The security of the bit commitments in [20] and [21] holds also 
in the concurrent setting. 

Proof. By definition, the binding property must be robust to asynchronous com- 
position. Otherwise, the committer may play a mental game in which his real 
stand-alone commitment is part of an asynchronous game which he simulates, 
and then defeat the binding property in the normal stand-alone world. 

As for the secrecy, a similar argument may be more complicated, since the 
receiver cannot simulate the behavior of the committer. Specifically, the com- 
mitter has some information that the receiver does not have: the value of the 
committed string, which may be used in the other commitments. However, in 
our proof, the committer commits on uniformly chosen random strings. (And on 
nothing else.) Thus, if the committer follows the protocol, then the receiver is 
able to simulate the rest of the environment and the above argument holds for 
secrecy as well. □ 



2.3 Witness Indistinguishability 

Witness indistinguishable proofs were presented in [13]. The motivation was to 
provide a cryptographic mechanism whose notion of security is similar though 
weaker than zero-knowledge, it is meaningful and useful for cryptographic pro- 
tocols, and the security is preserved in an asynchronous composition. A witness 
indistinguishable proof is a proof for a language in NP such that the prover is
using some witness to convince the verifier that the input is in the language,
yet, the view of the verifier in case the prover uses witness $w_1$ or witness $w_2$
is polynomial time indistinguishable. Thus, the verifier gets no knowledge on
which witness was used in the proof. The formal definition follows. For further
discussion and motivation the reader is referred to [13].

2.4 Black-Box Simulation

The initial definition of zero-knowledge [17] requires that for any probabilistic
polynomial time verifier V, a simulator $S_V$ exists that simulates V’s view. Oren
[23] proposes a seemingly stronger, “better behaved” notion of zero-knowledge,
known as black-box zero-knowledge. The basic idea behind black-box zero-knowledge
is that instead of having a new simulator $S_V$ for each possible verifier, we
have a single probabilistic polynomial time simulator S that interacts with each
possible V. Furthermore, S is not allowed to examine the internals of V, but must
simply look at V’s input/output behavior. That is, it can have conversations
with V and use these conversations to generate a simulation of V’s view that is
computationally indistinguishable from V’s view of its interaction with P.

For further definitions and motivations the reader is referred to [23].

2.5 Concurrent Zero-Knowledge

Following [10], we consider a setting in which a polynomial time adversary
controls many verifiers simultaneously. The adversary A takes as input a partial
conversation transcript of a prover interacting with several verifiers concurrently,
where the transcript includes the local times on the prover’s clock when each
message was sent or received by the prover. The output of A will be a tuple of
the form (V, a, t), indicating that P receives message a from a verifier V at time
t on P’s local clock. The adversary may either output a new tuple as above, or
wait for P to output its next message to one of the verifiers. The time that is
written by the adversary in the tuple must be greater than all times previously
used in the system (by messages sent to P or by P). The view of the adversary
on input x in such an interaction (including all messages and times, and the
verifiers’ random tapes) is denoted (P, A)(x).
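The only constraint placed on the adversary's scheduling is that every new time stamp exceeds all earlier ones. A minimal sketch of that check follows; representing a transcript as a list of (party, message, time) tuples, with the prover's own messages included, is our assumption for illustration.

```python
def is_valid_schedule(events):
    """Return True if every event time is strictly greater than all earlier ones.
    `events` is a list of (party, message, time) tuples in transcript order."""
    last_time = float("-inf")
    for _party, _message, time in events:
        if time <= last_time:
            return False
        last_time = time
    return True

# Two interleaved sessions on the prover's local clock (hypothetical transcript).
interleaved = [("V1", "m1", 1), ("V2", "m1", 2), ("P", "reply to V1", 3), ("V2", "m2", 7)]

# Invalid: the adversary tries to insert a message before one already delivered.
backdated = [("V1", "m1", 5), ("V2", "m1", 4)]
```

Apart from this monotonicity requirement, the adversary is free to interleave and delay the sessions arbitrarily, which is what makes simulation in the concurrent setting hard.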

Definition 2.4. We say that a proof or argument system (P, V) for a language
L is (computational) concurrent zero-knowledge if there exists a probabilistic
polynomial time oracle machine S (the simulator) such that for any probabilistic
polynomial time adversary A, the distributions (P, A)(x) and $S^A(x)$ are
computationally indistinguishable over the strings that belong to the language L.

In what follows, we will usually refer to the adversary A as the adversarial 
verifier V* or just the verifier V* . All these terms mean the same. 

In our setting, the simulator will simulate a predetermined time interval 
which is polynomial in k. We assume that while rewinding the verifier, the sim- 
ulator may also set its clock to the required rewound time. 

2.6 The Complexity Parameters 

In this paper, we simplify the discussion by using a single security parameter
k. Our proof has (in the worst case) $\omega(\log^2 k)$ rounds and it has responsive round
complexity $\omega(\log k)$. The zero-knowledge simulation is guaranteed for a polynomial
(in k) number of concurrent proofs. Also, the running time of the protocol
is polynomial in k. We will measure time by the smallest time units that are
relevant in this setting. For example, one may think of the time unit as the
minimal time a round in the protocol may take. But we may also use a much
smaller time unit: the time of a computer cycle. In any of these time units, it
holds that the running time of the protocol is polynomial (in k).

3 Main Result 

Our main result is the existence of a black-box concurrent zero-knowledge interactive
proof for all languages in NP with responsive round complexity m for any
m satisfying $m = \omega(\log k)$. We state this explicitly in the following theorem.

Theorem 3.1. Assume there exist secure two-round commitment schemes with
statistical secrecy and secure two-round commitment schemes with statistical
binding (such schemes follow from the existence of a family of claw-free
permutation pairs). Let k be a complexity parameter bounding the size of the input.
The verifier is polynomial time in k, and the concurrent proof may contain a
polynomial (in k) number of proofs concurrently. Then there exists a black-box
concurrent zero-knowledge interactive proof for all languages in NP, with:

- responsive round complexity $m(k)$, for any function $m(k)$ satisfying
  $m(k) = \omega(\log k)$, and

- a worst case round complexity of $m(k) \cdot \log k$.

4 Overview of Protocol and Proof 

We start with the protocol in [24,18]. We choose the following parameters for 
this protocol: the preamble consists of m rounds for $m = \omega(\log k)$ (recall that
a round consists of a message sent from the prover to the verifier followed by 
a response of the verifier). The body of the proof consists of a low error, con- 
stant round, auxiliary-input witness-indistinguishable interactive proof for NP 
in which the prover can be efficient given the witness to the proven assertion. 
The zero-knowledge protocol of [14] will do. 

When a new copy of the protocol is initiated by the verifier, the verifier in the
new protocol is associated with a response time which is initially the minimal
possible response time, say the time of a computer cycle. When the verifier fails
to respond within this time, the time associated with this verifier is doubled
and the verifier is notified that it must start again with the doubled time. In
this case we say that the verifier has been reset and has gone one level up. This
may happen at most $O(\log k)$ times since at some such level the response time
becomes greater than the running time of the adversary, or bigger than the time
interval that has to be simulated. The verifiers may be viewed as working in
levels of responsiveness. Level i contains all verifiers with response time at most
$\beta_i = 2^i$ and greater than $\beta_{i-1}$. The prover treats each verifier independently in
light of its associated response time or level. For each verifier, the prover delays
its answer according to its associated delay $\beta$ in a manner yet to be discussed.
The completeness and soundness of the interactive proof hold as in [24,18].
The worst case number of rounds for this protocol happens when the verifier
goes through m steps in each level and then delays its last message and is reset
while going up to the next level. This yields $m \cdot O(\log k)$ rounds in the worst
case. But since each level takes double the time of the previous level, the overall
interaction time is dominated by the time of the highest level interaction, and
is $O(m)$ times the response time at that level.
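
The level structure can be made concrete with a small Python sketch. It is an illustration only, under stated assumptions: the function names are hypothetical, time is measured in units of the minimal response time so that the level-i bound is 2^i, and every round at level i is charged twice that bound (the prover's spacing described later in the paper).

```python
def level_of(response_time: int) -> int:
    """Level i such that 2**(i-1) < response_time <= 2**i (level 0 for time 1).

    Assumes response_time is a positive integer number of time units.
    """
    return (response_time - 1).bit_length()


def worst_case_rounds(m: int, final_level: int) -> int:
    """Rounds used if the verifier spends m rounds at every level 0..final_level."""
    return m * (final_level + 1)


def total_time(m: int, final_level: int) -> int:
    """Total preamble time if each of the m rounds at level i lasts 2 * 2**i units."""
    return sum(m * 2 * 2**i for i in range(final_level + 1))


# The geometric growth means the last level dominates the total time:
# all levels together cost less than twice the final level alone.
m, top = 10, 5
assert total_time(m, top) < 2 * (m * 2 * 2**top)
```

The round count grows with the number of levels, while the elapsed time stays within a constant factor of the time spent at the final level.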

It remains to prove that the interactive proof is zero-knowledge. The delays
imposed by the prover are similar to those suggested in [10]. Thus, simulating
all protocols at the same level becomes possible in a way similar to that in [10].
In fact, these verifiers may be viewed as adhering to timing constraints. The
delays imposed by the prover are not more than twice $\beta_i$ for a verifier in level i
and thus do not increase the protocol time too much. It remains to show that
rewinding protocols at higher levels does not force too many rewindings at lower
levels. This is obtained with some care in the setting of the prover delays and
by the fact that there are at most a logarithmic number of levels.

The reason we need a logarithmic number of rounds and cannot do with a 
constant number of rounds for each level as in [10] is the relation between the 
various levels. We allow ourselves one rewind only to any interval we wish to 
rewind. Any other constant will do, but rewinding a super-constant number of 
times (or polynomial as in [10]) will make the overall simulation time super- 
polynomial. Note also that this is an inherent problem since the lower bound 
in [5] uses verifiers that in each specific copy of the proof do not modify their 
response time. Thus the lower bound holds also for responsive round complexity
and we cannot do with asymptotically less than $\log k / \log\log k$ responsive round
complexity.



5 The Zero-Knowledge Protocol 

We start by presenting the protocol. It consists of a preamble of m rounds
where m is any function satisfying $m = \omega(\log k)$ and a body consisting of a (not
concurrent) constant round zero-knowledge proof. If this were the full picture,
we would get that the overall number of rounds is dominated by m and is thus
almost logarithmic. However, we sometimes let the prover say “RESET”. This
happens only during the preamble, and is caused by a long delay in the verifier
response. When such a delay occurs, the protocol starts from the beginning with
the delay parameter doubled. At this point we say that the proof has gone up one
level. Generally a proof is at level i if it has gone through i resets.

To see that the overall round complexity is $m \cdot O(\log k)$ it is enough to note
that the maximum number of resets is logarithmic. This is true since the delay
can only be doubled a logarithmic number of times. The logarithm is in the length
of the simulated period. We denote this length by $\Delta$ and measure it in units of
$\beta_0$, i.e., the time of a computer cycle. In Fig. 1 we describe the protocol. This
is the protocol presented in [24,18] enhanced with time monitoring and possible
resets. All commitments from the verifier to the prover are statistically secret
and all commitments from the prover to the verifier are statistically binding.

Step V-0:
    V → P: V selects m strings $v_1, \ldots, v_m \in \{0,1\}^k$ uniformly and
    independently at random, and sends Commit($v_1$), ..., Commit($v_m$) to the
    prover.

Step P-1:
    P → V: Send Commit($p_1$) exactly T time units after the Step V-0 message
    was sent.

Step V-1:
    V → P: Reveal $v_1$.

Step P-j:
    P → V: If V’s message from Step V-(j - 1) was received more than T
    time units after P’s message from Step P-(j - 1) was sent then goto
    RESET. Else, send Commit($p_j$) exactly 2T time units after P’s round
    (j - 1) message was sent.

Step V-j:
    V → P: Reveal $v_j$.

Step P-m:
    P → V: If V’s message from Step V-(m - 1) was received more than T
    time units after P’s message from Step P-(m - 1) was sent then goto
    RESET. Else, send Commit($p_m$) exactly 2T time units after P’s round
    (m - 1) message was sent.

Step V-m:
    V → P: Reveal $v_m$.

Proof body:
    P waits T time units and then proves to V in zero-knowledge that $x \in L$
    or that $\exists i$, $1 \le i \le m$, such that $p_i = v_i$.
    (No delays or time monitoring is used during the course of this proof.)

End of proof

RESET:
    P → V: A reset message with parameter 2T. Both P and V continue by setting
    T = 2T and starting the protocol from Step V-0 again.

Fig. 1. The protocol
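
The timing logic of Fig. 1 (time monitoring plus RESET) can be sketched in Python. This is a hedged illustration, not the authors' implementation: commitments and the proof body are not modelled, `run_preamble` and the `response_delay` callback are hypothetical names, and as a simplification every one of the m reveals is checked against the current bound T.

```python
def run_preamble(m, response_delay, t0=1, max_level=64):
    """Simulate only the timing decisions of the preamble.

    response_delay(level, j) is a hypothetical callback returning how long the
    verifier takes to send "Reveal v_j" at the given level.
    Returns (final_level, rounds), where rounds counts every prover round,
    including those wasted before a RESET.
    """
    T, level, rounds = t0, 0, 0
    while level <= max_level:
        reset = False
        for j in range(1, m + 1):
            rounds += 1                        # prover sends Commit(p_j)
            if response_delay(level, j) > T:   # answer arrived too late
                reset = True
                break
        if not reset:
            return level, rounds               # preamble completed at this level
        T, level = 2 * T, level + 1            # RESET: double T, restart at V-0
    raise RuntimeError("verifier never answered within the allowed levels")


# A verifier that always needs 5 time units is reset three times (T = 1, 2, 4)
# and completes the preamble once T = 8.
final_level, rounds = run_preamble(4, lambda level, j: 5)
```

A verifier that stalls only on the last round of each level forces the full m rounds at every level, which is the worst case for the round count discussed in Sect. 4.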

Theorem 5.1. If the zero-knowledge proof used in the body of the protocol has
completeness error $\epsilon_c$ and soundness error $\epsilon_s$, then our interactive proof as in
Fig. 1 has completeness error $\epsilon_c$ and soundness error at most $\epsilon_s + \epsilon$ for some
negligible fraction $\epsilon$.

Proof. Clearly, the completeness error cannot increase. As for the soundness, the
prover may gain extra strength by managing to set $p_i = v_i$ for some $1 \le i \le m$.
However, since the verifier is using a statistically hiding commitment scheme, this
may happen with negligible probability only, and we are done. □

Lemma 5.2. The protocol has responsive round complexity 5m. 

Proof. We show that it holds for the preamble. The additional constant num- 
ber of rounds in the body of the proof cannot increase the responsive round 
complexity since the prover answers with no delays at that stage of the protocol. 

Consider a proof that ended the preamble at level $\ell$, i.e., had $\ell$ resets. (We
will discuss later the case that the preamble has not ended at all within the time
$\Delta$.) If $\ell = 0$ then the proof had m rounds, the length of each of which equals the
minimum possible response time, which was actually matched by the verifier.
Otherwise, we have $\ell \neq 0$. The proof was last reset at level $\ell - 1$, which means
that the verifier did not respond within time $\beta_{\ell-1} = 2^{\ell-1}$. Thus the response
time of the verifier is at least $\beta_{\ell-1}$. We now compute the overall communication
time and show that it is smaller than $5m \cdot \beta_{\ell-1}$.

At each of the levels $i = 1, 2, \ldots, \ell - 1$ the protocol ran for at most m
rounds. At level $\ell$ we assume it finished the preamble and thus had m rounds.
Summing over all the communication times during the preamble we get that the
communication time is bounded by

$$\sum_{i=1}^{\ell} m \cdot \beta_i = m \cdot \sum_{i=1}^{\ell} 2^i < m \cdot 2^{\ell+1} = 4 \cdot m \cdot \beta_{\ell-1} .$$

We bound the additional communication time of the proof body by $m \cdot \beta_{\ell-1}$. This
is correct for the constant round body if the verifier does not pose a delay longer
than $\beta_{\ell-1}$; if it does, the responsive round complexity may only decrease.

Last, we deal with the case that the verifier does not finish. We assume that
the simulation time $\Delta$ is much larger than the running time of the adversarial
verifier. Thus, a particular verifier that has not yet responded will never respond
and its responsive round complexity is much better than 5m. □
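
For reference, the arithmetic behind the constant 5 can be written out as follows. This is a worked restatement under the paper's convention $\beta_i = 2^i$; the charge of $m \cdot \beta_{\ell-1}$ for the proof body is the amount that the stated total of 5m requires, and is taken here as an assumption.

```latex
\begin{align*}
\sum_{i=1}^{\ell} m \cdot 2^{i}
  &= m \left( 2^{\ell+1} - 2 \right)
   < m \cdot 2^{\ell+1}
   = 4m \cdot 2^{\ell-1}
   = 4m \cdot \beta_{\ell-1}, \\
\underbrace{4m \cdot \beta_{\ell-1}}_{\text{preamble}}
  + \underbrace{m \cdot \beta_{\ell-1}}_{\text{body (assumed bound)}}
  &= 5m \cdot \beta_{\ell-1}.
\end{align*}
```

Dividing by the verifier's response time, which is at least $\beta_{\ell-1}$, gives at most 5m responsive rounds.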

We next show that the protocol is concurrent zero-knowledge, by presenting 
a simulator for the concurrent interaction. 



6 The Simulator 

We present a black box simulation of the above protocol. We assume the worst, 
i.e., that there is one adversary that controls all verifiers (whose number is poly- 
nomial in k). This adversary deviates from the protocol as it wishes and is 
limited only by being a polynomial time machine. The simulator interacts with 
this adversary (or with these verifiers) and its goal is to produce a transcript 
distribution which is indistinguishable from the real interaction between the ad- 
versary and the original prover P. Note that each message in the transcript is 
associated with a time telling when it is produced after the beginning of the 
interaction. 

The simulator simulates the body of the proof simply by playing the real 
prover. The reason it may do that is that it rewinds each of the verifiers so 
that it manages to get a round i in which $p_i = v_i$. After that we say that this
particular copy of the proof has been “solved”, or that this particular verifier 
has been neutralized. Our goal is to ensure that there will be enough rewinding 
so that all proofs will be solved, while taking care that the rewinding does not 
exceed polynomial time. 

The difficulty in the construction and in describing the simulator lies in the 
rewinding schedule. Other than that the operation of the simulator is quite 
simple. The simulator runs the adversary on a uniformly chosen random string
while performing all rewinds in the rewind schedule. The simulator breaks the
entire sequence of time steps into sections. Each section is simulated twice by 
the simulator. The first run is used to obtain information, and the second run 
is used to produce the actual output transcript. During simulation of each such 
section, the simulator recursively divides it into smaller subsections. 

During the first time a section is simulated, the simulator records the strings
revealed by the verifiers during this run. Then, during the second run
of the rewind, the simulator solves all proofs that may be solved by setting $p_i$
to equal $v_i$ for the known values $v_i$. The second run of the section is used to
produce the transcript obtained thus far. When a body of a proof arrives, if the
proof has been solved, then the simulator acts as the prover while proving the
existence of i such that $p_i = v_i$ (the simulator has a witness to this fact). If the
proof has not been solved, the simulation aborts and declares failure.

We will show that the probability that any of the proofs remains unsolved
is negligible. Thus, the simulator rarely fails. When it does not fail, its output
will be indistinguishable from the real interaction. One difference between the
simulated transcripts and the real ones is in the preambles: in the simulation
there is an i with $p_i = v_i$. But by the secrecy of the commitment schemes
this difference cannot be detected by a polynomial-time bounded machine. Note
that these strings are never revealed, avoiding difficulties arising when partial
subsets are revealed. The second difference is in the witness used in the bodies
of the proofs. However, a zero-knowledge proof is witness indistinguishable. This
property is preserved in a concurrent setting, and thus this difference cannot be
detected by a polynomial time distinguisher either.

It remains to show that there exists a rewinding schedule by which the sim- 
ulator is efficient and still all proofs are solved with overwhelming probability. 

6.1 The Rewinding Schedule 

The schedule of the rewinds is given as pseudo-code in Fig. 2 and is illustrated
in Fig. 3.

In Fig. 3, the X axis represents the time (as viewed by the verifiers or listed in the
output transcript produced by the simulator), and the numbers in the graph
represent the X coordinate (= the time) of an event. The Y axis represents the
order of events of the simulator itself. The advances of the simulation are shown
as thick arrows, whereas the rewinds are shown as thin backward arrows.

In the example of Fig. 3 the top-level run has exactly two recursive sections.
At the top level this is not always the case, but in any other level the recursion
is invoked exactly twice. The top level of this run is $\log \Delta$ (all logarithms are
base 2), where $\Delta$ is the length of the interval we simulate. In the example $\Delta = 4$
and the top level is 2. The first section starts at the beginning and ends when
the simulator advances from 1 to 2 after the seventh rewind (the second
$(1 \leftarrow 3)$). The second section begins in the advancement from 2 to 4 and ends
at the end of the simulation.

During the run of the top-level sections there are also rewinds of lower levels. 
In this example there is only one lower level: level 1. For each rewind of level

    // Recall that Δ is the overall simulation time interval.
    top_level = log(Δ)

    // This is a recursive algorithm. Top-level call follows:
    simul(0, Δ, top_level)
    output transcript

    // Definition of recursive function:
    simul(location, length, level)
        β =
        if (level < 1)
            // here comes the simulation of interaction with V* for time interval length
            return
        else
            // recursively run lower-level simulations
            for i = 0 to length/β - 1
                simul(location + i·β, β, level-1)
                simul(location + (i+1)·β, β, level-1)
                rewind_to_location(location + i·β)
                simul(location + i·β, β, level-1)
            end for
        end if

Fig. 2. Description of the Rewinding Schedule

$\ell$ there are 6 rewinds of level $\ell - 1$ (4 before the actual level-$\ell$ rewind, and
two after it). Thus before the first level-2 rewind ($(0 \leftarrow 4)$) there are 4 level-1
rewinds ($(0 \leftarrow 2)$, $(1 \leftarrow 3)$, $(2 \leftarrow 4)$ and $(3 \leftarrow 5)$) and two more after it
($(0 \leftarrow 2)$ and $(1 \leftarrow 3)$).

Note that the rewinding schedule does not depend on the schedule of proofs 
as determined by the adversarial verifier. It may be the case that no proof ran 
and the simulator would still behave the same. The rewinding schedule depends 
on the time only."^ 
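
The recursive schedule can be turned into a short runnable Python sketch that lists the rewinds in the order they occur. It is an illustration under an explicit assumption: the definition of β inside the recursive procedure is not legible in the source, so the sketch uses β = 2^(level-1), chosen because it reproduces the worked example (Δ = 4, top level 2). The function name `rewind_schedule` is hypothetical, and the interaction with V* itself is not modelled.

```python
def rewind_schedule(delta: int):
    """Return the rewinds as (start, end) pairs, in order of occurrence.

    delta is the length of the simulated interval and must be a power of two.
    A pair (s, e) stands for the rewind written (s <- e) in the text.
    """
    rewinds = []

    def simul(location, length, level):
        if level < 1:
            return                      # base case: plain simulation of V*
        beta = 2 ** (level - 1)         # ASSUMPTION: value not legible in the source
        for i in range(length // beta):
            start = location + i * beta
            simul(start, beta, level - 1)
            simul(start + beta, beta, level - 1)
            rewinds.append((start, start + 2 * beta))   # rewind back to start
            simul(start, beta, level - 1)

    top_level = delta.bit_length() - 1  # log2(delta) for a power of two
    simul(0, delta, top_level)
    return rewinds


schedule = rewind_schedule(4)
```

For Δ = 4 the first seven entries of `schedule` are (0, 2), (1, 3), (2, 4), (3, 5), (0, 4), (0, 2), (1, 3): four level-1 rewinds, the level-2 rewind, and two more level-1 rewinds, as in the description of the first section above.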



6.2 The Effects of the Rewinds 

Each proof may start its i-th level run at an arbitrary point in time. However,
by the delay of $\beta_i$ imposed by the prover, they all have the same look during the
preamble: The prover sends a message, then the verifier responds within time
$\beta_i$ and then the prover sends its next message exactly $2\beta_i$ time units after its

^ We remark that one may obtain better efficiency by checking if the rewind is helpful
to the simulation and avoiding the rewind if it is not. Even if one does not try to check
the messages, a scrutiny of the schedule may lead to other improvements. However
all we care about is that the simulation is polynomial time and this is guaranteed
by our simple non-optimized simulation procedure.

Fig. 3. The rewind schedule of two rounds of level 2



previous message. Thus, the time between two prover messages is always $2\beta_i$
and the response time of the verifier is less than $\beta_i$.

A rewind operation at level i that makes the simulator run the interval
$(T, T + 2\beta_i)$ twice is meant to solve all proofs of level i in which the verifier sent a
preamble message “Reveal $v_j$” (for some $1 \le j \le m$) in response to a prover
message that was sent in-between the times T and $T + \beta_i$. Note that in this
case the verifier must respond before $T + 2\beta_i$ and thus the simulator has learned
the value of $v_j$. Since the prover message is still within the rewind, the simulator
may modify the commitment to $p_j$ in the second run of the rewind and commit to
$p_j = v_j$. For example, the rewind $(0 \leftarrow 4)$ may solve proofs that run in the top
level and whose prover has sent a message between the times 0 and 2.

Had this always worked, we wouldn’t need so many preamble rounds. A 
couple of them would have been enough. However, here is what may go wrong: 
the verifier may delay its answer in the first run of the rewind interval, thus 
getting a reset message from the prover (simulator), yet, in the second run, 
provide an answer in time. In this case, the simulator would not know the value
of $v_i$ in the second run since it was not exposed in the first run. The second
run is the one that prevails and is written to the final transcript. Thus, solving the
proof in this round of the preamble fails in this case. In Sect. 7 below, we argue
that this happens with constant probability in each round and with negligible
probability in all the m rounds of the preamble. Note that setting $p_i = v_i$ in any
one of the rounds suffices to solve the proof.

It remains to analyze the probability that the simulator succeeds in solv- 
ing each of the proofs (before getting to the proof body) and to verify that 
the rewinding schedule results in a polynomial time simulator. This analysis is 
provided in the following section. 

7 Analysis of the Simulator 

7.1 Efficiency 

We start by showing that the rewinding schedule of Section 6.1 results in a 
polynomial time simulator. Note that each step of the simulation, i.e., commit- 
ting on strings, revealing them, and playing the prover in the proof-body are 
all polynomial time. Thus, if the rewinding is polynomial time, we get that the 
whole algorithm is efficient. We will actually show that the number of rewinds 
is polynomial. Since each rewind time is polynomial this is enough. 

Lemma 7.1. The overall number of rewinds during the simulation run on time
interval $\Delta$ is at most $\Delta^3$.

Proof. We use the recursive description of the rewinding schedule as in Fig. 2.
Consider a run of a time interval t at level $\ell$. Using the notation of Fig. 2 this
is a run of simul(location, t, $\ell$). Note that the number of rewinds is independent
of the location. Thus, we denote the number of rewinds in this run by $X(t, \ell)$.
In a run of t time units at level $\ell$ there are $t/\beta_\ell$ iterations of the main loop of
the simul procedure. In each iteration there is a level-$\ell$ rewind and 3 calls to
simul($\cdot$, $\beta_\ell$, $\ell - 1$) are performed (recall that $\beta_\ell = 2^\ell$). Thus,

$$X(t, \ell) \le \frac{t}{\beta_\ell} \cdot \left( 1 + 3 \cdot X(2\beta_{\ell-1}, \ell - 1) \right) .$$

This recursion inequality gives the bound $X(t, \ell) \le \frac{t}{\beta_\ell} \cdot 7^{\ell-1}$. At the top level,
$t = \Delta$, $\ell = \log \Delta$, and we obtain

$$X(\Delta, \log \Delta) \le \frac{\Delta}{\beta_{\log \Delta}} \cdot 7^{\log \Delta - 1} \le \frac{\Delta}{2^{\log \Delta}} \cdot 7^{\log \Delta} = 7^{\log \Delta} \le \Delta^3 ,$$

as required. □
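
The bound of the lemma can be checked numerically with a few lines of Python that count rewinds by following the recursion of Fig. 2. This is a sanity check rather than a proof, and it rests on the same assumption as any concrete reading of the figure: β is taken to be 2^(level-1), since its definition is not legible in the source. The name `count_rewinds` is hypothetical.

```python
def count_rewinds(length: int, level: int) -> int:
    """Number of rewinds performed by simul(location, length, level).

    Each loop iteration contributes one rewind of the current level plus the
    rewinds of its three recursive calls at the level below.
    """
    if level < 1:
        return 0
    beta = 2 ** (level - 1)   # ASSUMPTION: value not legible in the source
    iterations = length // beta
    return iterations * (1 + 3 * count_rewinds(beta, level - 1))


# Compare with the cubic bound for delta = 2, 4, ..., 1024.
for log_delta in range(1, 11):
    delta = 2 ** log_delta
    assert count_rewinds(delta, log_delta) <= delta ** 3
```

Under this reading the count grows roughly like $6^{\log \Delta}$, comfortably below $\Delta^3 = 8^{\log \Delta}$.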

7.2 Indistinguishability 

We show that no polynomial-time algorithm can distinguish the output of the
simulator from $V^*$’s view of its interaction with the original prover, P.

Assume, first, that the simulator always manages to solve all proofs before 
getting to the bodies of the proofs. We show later that this assumption holds 
with overwhelming probability.

We separate the discussion of the preambles and the proof bodies. The difference
in the preambles is that in the simulation one or more of the rounds
has $p_i = v_i$. In the real interaction, this seldom happens. The difference in the
bodies is that the simulator always proves that “$\exists i$ such that $p_i = v_i$” whereas
the original prover (almost) always proves that $x \in L$.

The prover never opens its commitments on the pi’s. By the secrecy of the 
commitment scheme, a polynomial time distinguisher cannot tell between preambles
generated by the simulator and real preambles. Since this is the case, the
adversarial verifier itself cannot distinguish between the first and the second runs 
of a rewind. We will use this fact to show that the simulator solves all proofs 
with high probability. 

Finally, the proof bodies are witness indistinguishable. By [13] this property
holds also in the concurrent setting. Thus, an efficient distinguisher cannot tell
between using a witness to “$\exists i$ such that $p_i = v_i$” and using a witness to “$x \in L$”,
and we are done.

It remains to show that the simulator may fail to solve one of the proofs
only with negligible probability. We first argue (in Claim 7.1) that for each
proof, each round of its preamble that appears in its final level is rewound.
We then argue (in Claim 7.2) that a rewind of such a round fails to solve the
proof with probability at most 1/3. Since the rounds are rewound independently,
and since solving the proof in one of them is enough, and since there are
$m = \omega(\log k)$ such rewinds before the body of the proof, we get that the proof
remains unsolved with probability at most $(1/3)^m$, which is negligible. Any of the
proofs may be run a polynomial number of times by the simulator (since intervals
are rewound) and there are a polynomial number of proofs. By the union bound, the
probability that any of these proofs is not solved by the end of the preamble
remains negligible.
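
The negligibility claim can be spelled out in one line. In the sketch below, $q(k)$ is a hypothetical polynomial bounding the total number of proof runs seen by the simulator (number of proofs times number of times each may be re-run); it is introduced here only for illustration.

```latex
\Pr[\text{some proof remains unsolved}]
  \;\le\; q(k) \cdot \left( \tfrac{1}{3} \right)^{m}
  \;\le\; q(k) \cdot 2^{-m}
  \;=\; q(k) \cdot k^{-m/\log k} .
```

Since $m = \omega(\log k)$, the exponent $m / \log k$ tends to infinity, so for every constant c the right-hand side is eventually smaller than $k^{-c}$.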

Claim 7.1. If a proof preamble terminates at level $\ell$, then each of its m rounds
at level $\ell$ is rewound.

Proof Sketch: By the delays posed by the prover, each round takes exactly $2\beta_\ell$
time units. For a rewind $(T \leftarrow T + 2\beta_\ell)$ to properly rewind a proof round, both
the prover’s message and the verifier’s message have to be within the interval
$(T, T + 2\beta_\ell)$.

By the requirement of the $\ell$-th level, the answer of the verifier must arrive
within $\beta_\ell$. Thus if the prover has sent Commit($p_i$) in the interval $(T, T + \beta_\ell)$, it
is guaranteed that the verifier’s reply (Reveal $v_i$) will arrive within the rewind
interval, and that the next prover message will be sent after the rewind interval
ends. Thus the round will be properly rewound by a rewind $(T \leftarrow T + 2\beta_\ell)$.

It remains to show that any message that the prover sends on level $\ell$ has an
associated level-$\ell$ rewind. Details are omitted. □

Claim 7.2. If in a proof $\Pi$ at level $\ell$ the prover’s message for round i is sent
at time T, then a rewind $(T' \leftarrow T' + 2\beta_\ell)$, where $0 \le T - T' < \beta_\ell$, solves $\Pi$ in
this rewind with probability at least 2/3.

Proof. If the verifier answers in time during both runs of the rewind then the
proof is solved: the first run reveals the value $v_j$ (for round j of the proof) and
in the second run of the rewind the prover may commit to a modified $p_j = v_j$.
Note that the verifier cannot modify $v_j$, except with negligible probability, since
it is committed to this value as of the first round of the proof $\Pi$. If the verifier
does not answer in the second run of the rewind, then it is actually reset into
level $\ell + 1$ and this proof does not need to be solved at level $\ell$. The only bad
case is when the verifier delays its answer in the first run, but does not delay it
in the second run of the rewind. In this case, the simulator does not learn the
value of $v_j$ in the first run and thus cannot set $p_j = v_j$ in the second run.

What is the probability of this bad incident? Since the prover only commits to
$p_j$, the verifier cannot tell whether $p_j = v_j$, and hence it cannot tell the first run
from the second run, except with the negligible probability with which the secrecy
of the commitment scheme fails. Suppose the verifier delays its message beyond
$\beta_\ell$ with probability p in the first run. It then delays its message with probability
at least $p - \epsilon$ for some negligible fraction $\epsilon$ in the second run. The probability
that the simulator does not solve the proof is thus at most
$p \cdot (1 - p + \epsilon) \le 1/4 + \epsilon < 1/3$ (using $p(1 - p) \le 1/4$ for every p), and
we are done. □

Abstract. We give a careful, fixed-size parameter analysis of a standard
[1,4] way to form a pseudorandom generator by iterating a one-way
function and then pseudo-random functions from said generator, [3]. We 
improve known bounds also asymptotically when many bits are output 
each iteration and we find all auxiliary parameters efficiently. The analy- 
sis is effective even for security parameters of sizes supported by typical 
block ciphers and hash functions. This enables us to construct very prac- 
tical pseudorandom generators with strong properties based on plausible 
assumptions. 



1 Introduction 

One of the most fundamental cryptographic primitives is the pseudo random gen- 
erator, a deterministic algorithm that expands a few truly random bits to long 
“random looking” strings. Having such implies (among other things) semanti- 
cally secure crypto systems, [5], secure key-generation for asymmetric cryptog- 
raphy etc. 

A sound theory of pseudo randomness did not emerge until the seminal works 
of Blum and Micali, [1], and Yao, [15]. Therefore, constructions in the early 80’s 
were still “ad-hoc”, and many of them later turned out to be completely insecure.
In a theoretical sense the area was closed when, in [6], it was shown
that a necessary and sufficient condition for the existence of a pseudo-random
generator is the existence of another fundamental primitive: the one-way function,
a function easy to compute, but hard to invert. We do not know if such
functions exist, but there are many strong candidates, such as a good block cipher
(mapping keys to cipher-texts, keeping the plaintext fixed), hash functions, etc.
Still, the construction in [6] is complex, requiring key-sizes of millions of bits to
give reasonable security guarantees, and an “ad-hoc” approach is still therefore
often used in practice. Thus, a construction with provable properties that is useful
in practice is highly desirable.

The reason for the ineffectiveness of the theoretical constructions is that one-wayness
is in itself not a strong property. A function may be hard to invert but
still have very undesirable properties. For instance, even if f is one-way, most
of x may still be easily deduced from f(x). Paradigms for generator construction
typically iterate f, and one-wayness may be lost in this process, etc. Thus,
basing pseudo-randomness on one-wayness alone appears to require elaborate
constructions. However, if one assumes only a little more than one-wayness, e.g.
that the function f is also a permutation, the situation becomes much more
favorable and reasonably practical constructions can be found from the work
of Blum and Micali mentioned above, and later work by Goldreich and Levin
[4]. In [1] it is shown that if f is a permutation and has at least a single bit of
information, b(x), that does not leak via f(x), then a pseudo-random generator
can be built. In [4], then, it is shown that every one-way function, in particular
every one-way permutation, has such a hard bit b(x). In this paper we make a
careful analysis of this transformation from a one-way function to a pseudorandom
generator, see Sect. 3. We add new elements of the analysis when we output
m > 1 bits for each iteration of f, significantly improving the dependence on m.
First, we (non-uniformly) reduce inversion of f to distinguishing the generator
from randomness, given some auxiliary parameters. We then give efficient sampling
procedures to determine the values of these parameters, giving a uniform
inversion algorithm, see Sect. 3.1. Values of the parameters that give almost as
strong results as the existential bounds can, for most parameter values, be found
in time less than the time needed for successive inversions.

A related primitive is the pseudo-random function: a function that cannot
be distinguished from a random function on the same domain/range. Goldreich,
Goldwasser, and Micali [3] showed how such functions could be built from a pseudo
random generator. In Sect. 3.2, we apply the same kind of fixed parameter analysis
to their construction and use it to further enhance our generator.

Our explicit theorems allow us to construct a generator that is efficient in 
practice based on the assumption that e.g. Rijndael (mapping keys to cipher- 
texts, fixing a plaintext) remains hard to invert even when iterated, see Sect. 4. 



2 Preliminaries 

2.1 Notation 

The length of a binary string x is denoted |x|, and by $\{0,1\}^n$ we denote the set of
x such that |x| = n. We write $U_n$ for the uniform distribution on $\{0,1\}^n$. Unless
otherwise noted, log refers to the logarithm in base 2.

Let $G : \{0,1\}^n \to \{0,1\}^{L(n)}$ and let A be an algorithm with binary output.
We say that A is a $(L(n), T(n), \delta(n))$-distinguisher for G, if A runs in time
$T(n)$ and $|\Pr_{x \in U_n}[A(G(x)) = 1] - \Pr_{y \in U_{L(n)}}[A(y) = 1]| \ge \delta(n)$. (We call $\delta(n)$
the advantage of A.) If no such A exists, G is called $(L(n), T(n), \delta(n))$-secure.
Finally, recall that a function $\nu(n)$ is negligible if for all c, $\nu(n) \in o(n^{-c})$.

Our model of computation is slightly generous but realistic. We assume that
simple operations like arithmetical operations and exclusive-ors on small^ size
integers can be done in unit time.
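
To make the notion of a distinguisher's advantage concrete, the following Python sketch computes it exactly by enumeration for a deliberately weak toy generator. Everything here is an illustrative assumption: the names `advantage`, `G_double` and `A_halves` are hypothetical, the generator (output the seed twice) is chosen only because its weakness is obvious, and running time T(n) is not modelled.

```python
from itertools import product


def advantage(G, A, n, out_len):
    """Exact advantage of A against G: {0,1}^n -> {0,1}^out_len.

    Computes |Pr_x[A(G(x)) = 1] - Pr_y[A(y) = 1]| by enumerating all seeds x
    of length n and all strings y of length out_len (feasible for tiny n only).
    """
    seeds = list(product((0, 1), repeat=n))
    strings = list(product((0, 1), repeat=out_len))
    p_gen = sum(A(G(x)) for x in seeds) / len(seeds)
    p_rand = sum(A(y) for y in strings) / len(strings)
    return abs(p_gen - p_rand)


def G_double(x):
    """Toy, insecure generator: output the seed followed by itself."""
    return x + x


def A_halves(y):
    """Distinguisher: output 1 iff the two halves of y are equal."""
    half = len(y) // 2
    return int(y[:half] == y[half:])


adv = advantage(G_double, A_halves, 3, 6)   # 1 - 2**-3
```

Here A always outputs 1 on generator outputs and outputs 1 on a uniform 6-bit string with probability 1/8, so the advantage is 7/8; a secure generator is one for which no efficient A achieves a noticeable value.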



2.2 Pseudo-Random Generators from One-Way Permutations 

Suppose we have a one-way function that in addition is a permutation. Furthermore,
suppose that we have a family of 0/1-functions, $B = \{b_i\}$, $b_i(x) \in \{0,1\}$,
which are efficiently computable such that given f(x), $b_i(x)$ is computationally
indistinguishable from a random 0/1 coin toss. Note that one-wayness of
f is necessary since otherwise $b_i(x)$ can be computed by first inverting f. We
then say that B is a (family of) hard-core functions for f. The following construction,
due to Blum and Micali [1], now shows how to construct a pseudo-random
generator (PRG): choose $x_0$ (the seed), let $x_{i+1} = f(x_i)$, then output
$g(x_0) = b_1(x_1), b_2(x_2), \ldots$ as the generator output.
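
The iteration just described can be sketched in a few lines of Python. This is a toy illustration with no security whatsoever: the prime 23, the generator 5, the permutation `f`, and the predicate `b` are all assumptions chosen so that the example is tiny and checkable, and a single predicate is used in place of a family $b_1, b_2, \ldots$.

```python
P = 23   # toy prime, far too small to be secure
G = 5    # 5 generates the multiplicative group modulo 23


def f(x: int) -> int:
    """Toy permutation of {1, ..., P-1}: x -> G^x mod P."""
    return pow(G, x, P)


def b(x: int) -> int:
    """Toy predicate standing in for a hard-core bit: 1 iff x is in the lower half."""
    return int(x <= (P - 1) // 2)


def blum_micali(seed: int, n_bits: int) -> list:
    """Return b(x_1), b(x_2), ..., b(x_n) with x_0 = seed and x_{i+1} = f(x_i)."""
    x, out = seed, []
    for _ in range(n_bits):
        x = f(x)            # advance the state: x_i = f(x_{i-1})
        out.append(b(x))    # emit one bit of x_i
    return out


bits = blum_micali(seed=3, n_bits=8)
```

Because f permutes its domain, the state never collapses onto a smaller set under iteration, which is the property the text relies on when it requires that f keeps its one-wayness when iterated.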

Theorem (Blum-Micali, ’84). Suppose there is an efficient algorithm D that
distinguishes (with non-negligible advantage) g(x) from a completely random
string. Then, there is an efficient algorithm P and an i such that given f(x), P
predicts $b_i(x)$ with non-negligible advantage.

Due to the iterative construction, f must not lose one-wayness under iteration.
This can be guaranteed if f is a permutation, or, heuristically, if f is randomly
chosen, see Theorem 1. Assumptions along these lines have been proposed by
Levin in [8] and were in fact the first conditions to be proved to be both necessary
and sufficient for the existence of pseudorandom generators.

This leaves us with one question: which one-way functions (if any) have hard- 
cores, and if so, what do these hard-cores look like? 



2.3 A Hard-Core for Any One-Way Function 

A fixed 0/1-function, $b$, can never be a general hard-core that works for every
one-way function: given a one-way function $f$, the one-way function $f'(x) =
f(x), b(x)$ provides a counterexample. In 1989, Goldreich and Levin [4] proved,
by introducing extra randomness, that any one-way function can be modified
to have hard-cores.^ Perhaps surprisingly, the hard-cores they found are also
extremely simple to describe. If $r, x$ are binary strings of length $n$, let $r_i$ (and
$x_i$) denote the $i$th bit of $r$ (and $x$), fixing an order left-to-right, or right-to-left.
Let $B = \{b_r(x) \mid r \in \{0,1\}^n\}$ where

$b_r(x) = \langle r, x \rangle_2 = r_1 \cdot x_1 + r_2 \cdot x_2 + \cdots + r_n \cdot x_n \bmod 2$,

that is, the inner product mod 2.
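The hard-core bit above is cheap to evaluate when the strings are held in machine words. The following is a minimal sketch, assuming $r$ and $x$ are encoded as Python integers whose binary digits are the bits $r_i$, $x_i$; the function name `gl_bit` is ours and not from the paper.

```python
def gl_bit(r: int, x: int) -> int:
    """Inner product mod 2 of the bit vectors encoded by the integers r and x."""
    # AND selects the positions where both bits are 1; the parity of the
    # number of such positions is the sum r_1*x_1 + ... + r_n*x_n mod 2.
    return bin(r & x).count("1") % 2


if __name__ == "__main__":
    print(gl_bit(0b1011, 0b0110))
```

Since the inner product is linear over GF(2), `gl_bit(r, x ^ y)` equals `gl_bit(r, x) ^ gl_bit(r, y)`, which is the property the later reductions rely on.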

^ We need words of size $n$ where $n$ is the size of the input on which we apply our
one-way function, e.g. $n = 128$ or 256 for a typical block cipher.

^ We again stress that this does not automatically imply that a PRG can be built 
from any one-way function, as the construction by Blum and Micali only works for 
one-way permutations. 
Theorem (Goldreich-Levin, '89). Suppose there is an efficient algorithm $A$,
that given $f'(x) = f(x), r$ for randomly chosen $r, x$, distinguishes (with
non-negligible advantage) $b_r(x)$ from a completely random bit. Then there exists an
efficient algorithm $B$, that inverts $f(x)$ on random $x$ with non-negligible
probability.

If $f$ is a one-way function, existence of such $A$ would be contradictory.

As established already in [4], a way to improve efficiency in a PRG construction
would be to extract more than one bit per iteration of $f$. It is possible to
output as many as $m \in O(\log n)$ (where $n = |x|$) bits, by multiplying the binary
vector $x$ by a random $m \times n$ binary matrix, $R$. Denote the set of all such matrices
$\mathcal{M}_m$, and our functions are $\{B^m_R(x) \mid R \in \mathcal{M}_m\}$. That is, $B^m_R(x) = R \cdot x \bmod 2$.
The above thus leads to a general construction, given any one-way function.



3 The Construction and Its Security 

3.1 The Basic PRG 

Definition 1. Let $n$, and $m, L, \lambda$ be integers such that $L = \lambda m$ and let $f :
\{0,1\}^n \to \{0,1\}^n$. The generator $BMGL^f_{n,m,L}(x, R)$ stretches $n + nm$ bits to
$L$ bits as follows. The input is interpreted as $x_0 = x$ and $R \in \mathcal{M}_m$. Let $x_i =
f(x_{i-1})$, $i = 1, 2, \ldots, \lambda$ and let the output be $\{B^m_R(x_i)\}_{i=1}^{\lambda}$.
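Definition 1 can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: the matrix $R$ is stored as a list of $m$ row integers, and `toy_f` (truncated SHA-256 on 128-bit inputs) is a hypothetical stand-in for the one-way function $f$, which the paper later suggests instantiating with Rijndael.

```python
import hashlib


def parity(v: int) -> int:
    """Parity of the number of 1-bits in v."""
    return bin(v).count("1") & 1


def b_matrix(rows, x: int):
    """B^m_R(x) = R*x mod 2, with R given as a list of m row integers."""
    return [parity(r & x) for r in rows]


def toy_f(x: int, n: int = 128) -> int:
    """Stand-in for f: truncated SHA-256 (an assumption, not the paper's choice)."""
    digest = hashlib.sha256(x.to_bytes(n // 8, "big")).digest()
    return int.from_bytes(digest[: n // 8], "big")


def bmgl(x: int, rows, L: int, f=toy_f):
    """Output L bits: iterate x_i = f(x_{i-1}) and emit R*x_i for i = 1..L/m."""
    m = len(rows)
    if L % m != 0:
        raise ValueError("L must be a multiple of m")
    out = []
    for _ in range(L // m):
        x = f(x)                       # x_i = f(x_{i-1})
        out.extend(b_matrix(rows, x))  # m output bits per iteration
    return out
```

Note that the first output block is computed from $x_1 = f(x_0)$, not from the seed itself, matching the definition.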

A proof of the practical security for a concrete $f$ and fixed $n, m$ requires a very
exact analysis, and that analysis is the bulk of this paper. To begin with, we
would like to relate the difficulty of inverting an iterated function $f$ to that of
distinguishing outputs of $BMGL^f_{n,m,L}$ from random bits. This is made difficult
by the fact that we no longer require $f$ to be a permutation. However, under
one additional and natural assumption on the “behavior” of $f$, we can bring the
analysis one step further, relating the security of $BMGL^f_{n,m,L}$ more directly to
the difficulty of inverting $f$ itself. Our measure of success is as follows.

Definition 2. For a function $f : \{0,1\}^n \to \{0,1\}^n$, let $f^{(i)}(x)$ denote $f$ iterated
$i$ times, $f^{(i)}(x) = f(f^{(i-1)}(x))$, $f^{(0)}(x) = x$.

Let $A$ be a probabilistic algorithm which takes an input from $\{0,1\}^n$ and has
output in the same range. We then say that $A$ is a $(T, \delta, i)$-inverter for $f$ if
when given $y = f^{(i)}(x)$ for an $x$ chosen uniformly at random, in time $T$ with
probability $\delta$ it produces $z$ such that $f(z) = y$.

Note that $z$ might be of the form $f^{(i-1)}(x')$ but this is not required.
It is interesting to investigate what happens for a random function.

Theorem 1. Let $A$ be an algorithm that tries to invert a black box function
$f : \{0,1\}^n \to \{0,1\}^n$, and makes $T$ calls to the oracle for $f$. If $A$ is given
$y = f^{(i)}(x)$ for a random $x$, then the probability (over the choice of $f$ and $x$)
that $A$ finds a $z$ such that $f(z) = y$ is bounded by $T(i+1)2^{-n}$. On the other
hand, there is an algorithm that using at most $T$ oracle calls outputs a correct $z$
except with probability at most $(1 - (i+1)2^{-n})^{T-i} + i^2 2^{-n}$.
Proof (sketch). For the lower bound on the required number of oracle calls,
consider the process of computing $f^{(i)}(x)$ and let $W$ be the values occurring in
this process. If an inverter does not obtain any $w \in W$, there is no correlation
between the inverter and the evaluation process. If the inverter makes $T$ calls to
the oracle, the probability of obtaining a $w \in W$ is at most $(i+1)T2^{-n}$ and this
can be formalized.

To construct an inverter, first assume that the $i+1$ values seen under the
evaluation of $f^{(i)}(x)$ are distinct. This happens except with probability (over
random $f$) at most $i^2 2^{-n}$ and if it does not happen we simply give up. Now
consider the following inverter. It is given $y = f^{(i)}(x)$. Start by setting $x_0 = 0^n$
and $x_j = f(x_{j-1})$ for $j = 1, 2, \ldots$. Continue this process until either $x_j = y$ (and
it is done) or $x_j$ is a value it has seen previously. In the latter case it changes $x_j$
to a random value it has not seen previously and continues. Each value it sees
is a random value and if it ever gets one of the $i+1$ values in $W$, it finds the $y$
within at most $i$ additional evaluations of $f$. The probability of not finding such
a good value in the $T - i$ first steps is at most $(1 - (i+1)2^{-n})^{T-i}$. □
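The walking inverter from the proof sketch can be tried on a small random function. The sketch below is an illustration only: the names `iterate` and `walk_inverter`, and the bookkeeping with a set of queried points, are our assumptions. It never queries the same point twice, and restarts at a fresh random point whenever the walk reaches a point already queried.

```python
import random


def iterate(f, x, i):
    """Return f^{(i)}(x), i.e. f applied i times."""
    for _ in range(i):
        x = f(x)
    return x


def walk_inverter(f, y, n, max_calls, rng):
    """Search for z with f(z) == y using at most max_calls oracle calls."""
    queried = set()
    fresh = list(range(2 ** n))
    rng.shuffle(fresh)              # random order for restart points
    cur, calls = 0, 0               # the walk starts at 0^n
    while calls < max_calls:
        if cur in queried:
            # Walk closed on itself: jump to a random point not yet queried.
            while fresh and fresh[-1] in queried:
                fresh.pop()
            if not fresh:
                return None
            cur = fresh.pop()
        nxt = f(cur)
        queried.add(cur)
        calls += 1
        if nxt == y:
            return cur              # f(cur) == y, so cur is a valid preimage
        cur = nxt
    return None
```

With `max_calls` equal to $2^n$ every point is eventually queried, so on a toy domain the search always succeeds; the interesting regime in Theorem 1 is of course $T \ll 2^n$.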

Consider for instance the block cipher Rijndael [13] as a one-way function (fixing
a message, mapping keys to cipher-texts). It is reasonable to expect that Rijndael
is almost as hard to invert as a random function, so that the best achievable time
over success ratio to invert it after being iterated $i$ times would be, by the above,
not too much smaller than $2^n/i$. The security is now defined as follows.

Definition 3. A $\sigma$-secure one-way function is an efficiently computable function
$f : \{0,1\}^n \to \{0,1\}^n$, such that the average time over success ratio for inverting
the $i$th iterate is at least $\sigma 2^n/i$. That is, $f$ cannot be $(T, \delta, i)$-inverted for any
$T/\delta < \sigma 2^n/i$.

A block cipher, $f(k, p)$, $|p| = |k| = n$, is called $\sigma$-secure if the function $f_p(k)$,
for fixed, known plaintext $p$, is a $\sigma$-secure one-way function of the key $k$.

Hence, for our “practical” choice, $f$ = Rijndael, we expect it to be about
1-secure in the above terminology. Note also that if $f$ is a permutation, only the
case $i = 1$ is of interest and we have a standard notion of security.



Security of the Generator. Our objective is to show that if $BMGL^f_{n,m,L}$ is
not $(L, T, \delta)$-secure for “practical” values of $L, T, \delta$, then there is also a practical
attack on the underlying one-way function $f$. In particular, we show the following
theorem:

Theorem 2. Suppose that $G = BMGL^f_{n,m,L}$ is based on an $n$-bit function
$f$, computable by $E$ operations, and that $G$ produces $L$ bits in time $S$. Suppose
that this generator can be $(L, T, \delta)$-distinguished. Then, setting $\delta' = \delta m/L$,
there exist integers $i \le L/m = \lambda$, $0 \le j \le 2\log\delta'^{-1}$, such that for $k =
\max(m, 1 + \log((2n+1)\delta'^{-2}) - j)$, $f$ can be $(T', d_j/2, i)$-inverted, where $d_j$ is
given by (7) and (8), and $T'$ equals

$(1 + o(1))2^{m+k}(2m + k + 1 + T + S + E)(n + 1)$.

Values of $i$ and $j$ such that $f$ can be $((8 + o(1))T', d_j/16, i)$-inverted can, with
probability at least 1/4, be found in time $O(\delta'^{-2}(T + S))$.
The time-success ratio for most ranges of $S$ and $T$ is worst when the value of $j$ is
small. For $j \in O(1)$ and $m, k, E \le S \le O(T)$ the ratio is $O(n^2 L^2 \delta^{-2} 2^m T)$. The
preprocessing time (to find $i, j$) is small compared to the running time except in
the cases when $j$ is large. In those cases the time to find $j$ is still smaller than
the running time of the inverter while the running time to find $i$ might be larger
for some choices of the parameters.

A similar result could be obtained from the original works by Blum-Micali
and Goldreich-Levin, but we are interested in a tight result and hence we have to
be more careful than in [4] where, basically, any polynomial time reduction from
inverting $f$ to distinguishing the generator would be enough. Optimizations of
the original proof also appeared in [9], but are not stated explicitly.

The proof of Theorem 2 has two main components. We first show (Lemma 1
below) that a distinguisher for BMGL can be turned into a distinguisher for
$B^m_R(f^{(i-1)}(x))$, given $f^{(i)}(x), R$, for some $i$. Then we show (Theorem 3) how this
latter distinguisher is converted to an inverter for $f^{(i)}$.

We thus start with the following lemma.

Lemma 1. Let $L = \lambda m$. Suppose that $BMGL^f_{n,m,L}$ runs in time $S(L)$. If this
generator is not $(L, T(L), \delta)$-secure, then there is an algorithm, depending on an
integer $i$ with $1 \le i \le L/m$, that, using $T(L) + S(L)$ operations, given
$f^{(i)}(x), R$, for random $x \in U_n$, $R \in \mathcal{M}_m$, distinguishes $B^m_R(f^{(i-1)}(x))$ from $U_m$
with advantage $\delta' = \delta/\lambda$. Using $c_1\delta'^{-2}(T(L) + S(L))$ operations, where
$c_1$ is the constant given by (5), a value of $i$ achieving advantage $\delta_i \ge \delta'/2$ can
be found with probability at least 1/2.

We conjecture that the time needed to find $i$ is optimal up to the value of the
constant $c_1$. Even if a good value $i$ was found at no cost, the straightforward
way by sampling to verify that it actually is as good as claimed would take
time $\Omega(\delta'^{-2}(T(L) + S(L)))$. It is not difficult to see that the below proof can
be modified to find an $i$ with $\delta_i$ arbitrarily close to $\delta'$. The cost is simply an
increase in the constant $c_1$.

Assuming for the moment the following Lemma (a proof is found in the 
Appendix), we can use it to show Lemma 1. 

Lemma 2. Let $F$ be a function $F : \{0,1\}^n \times \mathcal{M}_m \to (\{0,1\}^m)^{\lambda}$, computable in
time $\le S$. Let $H^i$ be the distribution on $(\{0,1\}^m)^{\lambda}$ induced by replacing the first
$im$ bits of $F(x, R)$ by random bits.

Suppose that $H^0$ ($= F(x, R)$) and $H^{\lambda}$ ($= (U_m)^{\lambda}$) are distinguishable with
advantage $\delta$, by an algorithm $D$ running in time $T$. Then, a value of $i < \lambda$ for
which $H^i$, $H^{i+1}$ can be distinguished with advantage $\delta/(2\lambda)$, can with probability
at least 1/2, be found in time $c_1\delta'^{-2}(T + S)$ where $\delta' = \delta/\lambda$ and $c_1$ is an
absolute constant.

For the moment, just note that the existence of such an i (and even slightly 
better advantage) follows directly from the triangle inequality. 

Proof. The proof uses the so-called universality of the next-bit-test, by Yao [15],
see also [1].

We assume we know the good value of $i$ as in Lemma 2. Let $F(x, R) =
BMGL^f_{n,m,L}(x, R)$. On input $f^{(i)}(x), R, \gamma$, where $\gamma$ is either random, or, equal
to $B^m_R(f^{(i-1)}(x))$ we do as follows. We easily generate an element according to
the corresponding hybrid distribution of Lemma 2, with the exception that the
$(i+1)$st $m$-bit block is assigned the value $\gamma$. We feed this value to $D$ and answer
as it does. We see that precisely depending on whether $\gamma$ is random or not, we run
$D$ on an input from $H^{i+1}$, or, from $H^i$, and the lemma follows. □

We now give the theorem of Goldreich and Levin [4] trying to be careful with
our estimates and construction. Apart from the value of the constants we have an
improvement over previous results in the dependence on the parameter $m$. While
previous constructions would yield a factor proportional to $2^{2m}$ we decrease this
to $2^m$. The improvement is due to the fact that we treat the case of general $m$
directly rather than reducing it to the case $m = 1$ (see later discussion).

The second main step towards Theorem 2 is: 

Theorem 3. Fix $x$. Suppose there is an algorithm, $P$, that using $T$ operations, when
given random $R$ distinguishes $B^m_R(x)$ from random strings of length $m$ with
advantage at least $\varepsilon$ where $\varepsilon$ is given. Then, for $k = \max(m, \log(\varepsilon^{-2}(2n+1)))$, we
can in time

$(1 + o(1))2^{m+k}(2m + k + 1 + T)(n + 1)$

produce a list of $2^{k+m}(n+1)$ values such that the probability that $x$ appears in
this list is at least 1/2.

As we understand, a similar statement (up to a constant), for the special case of
$m = 1$, can be derived from [9]. In most applications one has $m \le \log(\varepsilon^{-2}(2n+1))$
and thus the latter value of $k$ should be considered standard.

We now collect the last pieces for the proof of Theorem 2 by proving the
above Theorem 3 which, in turn, relies on the following preliminaries.

Lemma 3. Fix any $x \in \{0,1\}^n$. For $m \le k$, from $m + k$ randomly chosen
$a_0, \ldots, a_{m-1}$ and $b_0, \ldots, b_{k-1} \in \{0,1\}^n$, it is possible in time $2m2^k$ + + $m + 4k$
to generate a set of $2^k$ uniformly distributed, pairwise independent matrices
$R^0, \ldots, R^{2^k-1} \in \mathcal{M}_m$. Furthermore, there is a collection of $m \times (m+k)$ matrices
$\{M_j\}_{j=0}^{2^k-1}$ and a vector $z \in \{0,1\}^{m+k}$ such that $B^m_{R^j}(x) = M_j z$ for all $j$.

The proof is given in the Appendix. The construction generalizes that of Rackoff
for the case $m = 1$, see [2]. If $k < m$, we use $k' = m$ above and then simply only
take the first $2^k$ matrices.

Lemma 4. Let $P$ be an algorithm, mapping pairs $\mathcal{M}_m \times \{0,1\}^m \to \{0,1\}$,
whose running time is $T$, let $R^j, M_j$ be the matrices generated as described in
Lemma 3 and let $S = \{S_j\}_{j=0}^{2^k-1}$ be an arbitrary matrix set in $\mathcal{M}_m$.
In time $2^{m+k}(2m + k + T)$ it is possible to compute $2^{m+k}$ values $c_l$,
such that for at least one $l$ we have $c_l = E_j[P(R^j + S_j, B^m_{R^j}(x))]$. The value of $l$
is independent of $S$.

The role of the set S is explained shortly. 

Proof. First run $P$ on all the $2^{m+k}$ possible inputs of form $(R^j + S_j, r)$ and record
the answers: $\{P(R^j + S_j, r)\}$. A fixed value of $l$ above corresponds to a value of
the $m + k$ bits $z_l$ in Lemma 3. Let us assume that $z_l$ is the correct choice, i.e.
$B^m_{R^j}(x) = M_j z_l$. We define

$c_l = 2^{-k} \sum_{j=0}^{2^k-1} P(R^j + S_j, M_j z_l) = 2^{-k} \sum_{j=0}^{2^k-1} \sum_{r=0}^{2^m-1} P(R^j + S_j, r)\,\Delta(r, M_j z_l)$,   (1)

where $\Delta(r, r') = 1$ if $r = r'$ and 0 otherwise. The naive way to calculate these
numbers is costly, but we can do better using the Fast Fourier
transform. First note that $\Delta(r, r') = 2^{-m} \sum_{\alpha \subseteq [0..m-1]} (-1)^{\langle r \oplus r', \alpha \rangle_2}$, identifying
$\alpha$ with its characteristic vector. This implies that the sum (1) equals

$c_l = 2^{-(m+k)} \sum_{j,r,\alpha} P(R^j + S_j, r)(-1)^{\langle r \oplus M_j z_l, \alpha \rangle_2}
     = 2^{-(m+k)} \sum_{j,\alpha} (-1)^{\langle M_j z_l, \alpha \rangle_2} \sum_r P(R^j + S_j, r)(-1)^{\langle r, \alpha \rangle_2}$.

Let $Q(j, \alpha)$ be the inner sum and fix a value of $j$. Notice that each $\alpha$-value then
corresponds to a Fourier transform and hence the $2^m$ different numbers $Q(j, \alpha)$
can be calculated in time $m2^m$ for this fixed $j$ and hence all the numbers $Q(j, \alpha)$
can be computed in time $m2^{k+m}$. Finally we have

$c_l = 2^{-(m+k)} \sum_{j,\alpha} (-1)^{\langle M_j z_l, \alpha \rangle_2} Q(j, \alpha) = 2^{-(m+k)} \sum_{j,\alpha} (-1)^{\langle z_l, M_j^T \alpha \rangle_2} Q(j, \alpha)$,

where $M_j^T$ is the transpose. But this is just a rearrangement (induced by $M_j^T$)
of the standard Fourier transform of size $2^{k+m}$ and can be computed with
$(k + m)2^{k+m}$ operations. The lemma follows. □
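The speed-up in the proof rests on the fact that all $2^m$ sums of the form $\sum_r g(r)(-1)^{\langle r, \alpha \rangle_2}$ can be computed together in about $m2^m$ additions. The sketch below shows the standard fast Walsh-Hadamard transform next to the naive double loop; the function names are ours, and `values[r]` plays the role of the recorded answers $P(R^j + S_j, r)$ for one fixed $j$.

```python
def fwht(values):
    """Fast Walsh-Hadamard transform of a list whose length is a power of two.

    Returns t with t[alpha] = sum_r values[r] * (-1)^{<r, alpha>_2}.
    """
    a = list(values)
    n = len(a)
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            for k in range(start, start + h):
                u, v = a[k], a[k + h]
                a[k], a[k + h] = u + v, u - v   # butterfly step
        h *= 2
    return a


def naive_transform(values):
    """Same transform by direct summation, quadratic in len(values)."""
    n = len(values)
    return [
        sum(values[r] * (-1) ** bin(r & alpha).count("1") for r in range(n))
        for alpha in range(n)
    ]
```

For a list of length $2^m$ the butterfly loop runs $m$ passes of $2^{m-1}$ steps each, which is the $m2^m$ cost quoted for computing the numbers $Q(j, \alpha)$ for one $j$.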

We prove now that we can compute useful information about x. 

Lemma 5. Let $P$, $T$, $x$ and $\varepsilon$ be as in Theorem 3. Then for any set of $N$ vectors
$\{v_i\}_{i=1}^N \subset \{0,1\}^n$ and any $k \ge m$ we can in time $(1 + o(1))2^{m+k}(2m + k + T +
1)(N + 1)$ produce a set of lists $\{b_i^{(j)}\}_{i=1}^N$, $j = 1, 2, \ldots, 2^{k+m}(N + 1)$ such that
with probability 1/2 we have for at least one $j$, $\langle x, v_i \rangle_2 = b_i^{(j)}$ except for at most
$2^{1-k}\varepsilon^{-2}N$ of the $N$ possible values of $i$.

Proof. Start by randomly generating the $2^k$ matrices $\{R^j\}$ as shown in Lemma 3.
Now repeat the process below for each $i = 1, \ldots, N$. Select $2^k$ (pairwise)
independent random strings $s^i_j \in \{0,1\}^m$, and let $S^i_j$ be the $m \times n$ matrix defined
by $S^i_j = s^i_j \otimes v_i$ (the outer product, i.e. $(S^i_j)_{k,l} = (s^i_j)_k \cdot (v_i)_l$). Notice that by
linearity

$(R^j + S^i_j)x = R^j x + s^i_j \langle v_i, x \rangle_2$,   (2)

which is $B^m_{R^j}(x)$ if $\langle v_i, x \rangle_2 = 0$, and a random string otherwise.
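Identity (2) is easy to check numerically. The sketch below is only a sanity check under our own encoding: matrices are lists of row integers, vectors are integers, and the helper names are hypothetical. Adding the outer product $s \otimes v$ to $R$ amounts to XORing $v$ into every row $k$ with $s_k = 1$.

```python
import random


def parity(v: int) -> int:
    return bin(v).count("1") & 1


def mat_vec(rows, x: int):
    """R*x mod 2 for R given as a list of row integers."""
    return [parity(r & x) for r in rows]


def add_outer(rows, s_bits, v: int):
    """R + s (x) v over GF(2): row k becomes R_k XOR v when s_k = 1."""
    return [r ^ (v if s else 0) for r, s in zip(rows, s_bits)]


def check_identity(n: int, m: int, rng) -> bool:
    """Check (R + s (x) v) x == R x + s <v, x>_2 on one random instance."""
    x, v = rng.getrandbits(n), rng.getrandbits(n)
    rows = [rng.getrandbits(n) for _ in range(m)]
    s = [rng.getrandbits(1) for _ in range(m)]
    lhs = mat_vec(add_outer(rows, s, v), x)
    ip = parity(v & x)                      # <v, x>_2
    rhs = [a ^ (sk & ip) for a, sk in zip(mat_vec(rows, x), s)]
    return lhs == rhs
```

When $\langle v, x \rangle_2 = 1$ the right-hand side is $Rx$ masked by the random string $s$, which is why the distinguisher then sees a uniformly random $m$-bit string.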

As described in Lemma 4, we now compute the values $\{c^i_l\}$,

$c^i_l = 2^{-k} \sum_{j=0}^{2^k-1} P(R^j + S^i_j, M_j z_l)$.

Focus on the correct choice for $l$. If $\langle v_i, x \rangle_2 = 0$, then $c^i_l$ is the average of a
uniformly random, pairwise independent sample of the distinguisher $P$ on inputs
of the form $\{P(R, B^m_R(x))\}$. On the other hand, if $\langle v_i, x \rangle_2 = 1$, it is a sample of
$\{P(R, u)\}$ over random $u$.

Suppose $p_R$ is the probability that $P$ outputs 1 when the $m$ bits are picked as
$B^m_R(x)$ and let $p_U$ be the same probability when the $m$ bits are picked randomly.
Let $p = (p_R + p_U)/2$. Note that we do not know the value of $p$. We deal with
this problem later, so for the moment suppose we do.

We guess that $\langle v_i, x \rangle_2 = 0$ if $c^i_l \ge p$ and $\langle v_i, x \rangle_2 = 1$ otherwise. The choice
is correct unless the average of $2^k$ pairwise independent Boolean variables is at
least $\varepsilon/2$ away from its mean. By Chebyshev's inequality the probability that
this happens is bounded by $2^{-k}\varepsilon^{-2}$.
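The Chebyshev step can be spelled out as follows. This is a routine derivation added for convenience, using only that a Boolean variable has variance at most 1/4 and that variances of pairwise independent variables add; the symbols $X_j$ and $\mu$ are our notation for the $2^k$ sampled outputs of $P$ and their common mean.

```latex
% X_1, ..., X_{2^k} pairwise independent, X_j \in \{0,1\}, common mean \mu
\operatorname{Var}\Big[2^{-k}\sum_{j} X_j\Big]
  = 2^{-2k}\sum_{j}\operatorname{Var}[X_j]
  \le 2^{-2k}\cdot 2^{k}\cdot \tfrac{1}{4} = 2^{-k-2},
\qquad
\Pr\Big[\,\Big|2^{-k}\sum_{j} X_j - \mu\Big| \ge \varepsilon/2\Big]
  \le \frac{2^{-k-2}}{(\varepsilon/2)^{2}} = 2^{-k}\varepsilon^{-2}.
```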

This implies that for the correct values of $l$ and $p$, the expected number of
errors is $2^{-k}\varepsilon^{-2}N$, and by Markov's inequality, with probability at least 1/2
it is below $2^{1-k}\varepsilon^{-2}N$. There are $2^{m+k}$ possible values of $l$ and once $l$ is fixed
the only information on $p$ needed is for which $i \in [1..N]$ we have $c^i_l \ge p$ (if any).
Thus, there are only $N + 1$ such choices.

The time needed to construct the matrices is negligible, computing the values
$c^i_l$ can be done in time $2^{k+m}(2m + k + T)N$, and at most time $2^{k+m}(N + 1)$ is
needed to output the final lists. □

We finally establish Theorem 3. 

Proof (of Theorem 3). Set $k = \max(m, \log(\varepsilon^{-2}(2n+1)))$. We apply Lemma 5
with $N = n$, and let $\{v_i\}_{i=1}^n$ be the unit vectors so that $\langle v_i, x \rangle_2$ gives the $i$th bit
of $x$. With probability 1/2 one list gives all inner products correctly and hence
determines $x$. □

We can now use Theorem 3 and Lemma 1 to establish Theorem 2, see the 
Appendix. 

Instead of applying Lemma 5 with the unit vectors we can, as suggested
in [2], use it with $\{v_i\}$ describing the words of an error correcting code, e.g. a
suitable Goppa code, [10]. (Similar ideas appear in [8].) If we have code words
of length $N$, containing $n$ information bits, and we are able to efficiently correct
$e$ errors we get the following variant of Theorem 3:

Theorem 4. Fix $x$. Suppose there is an algorithm, $P$, that using $T$ operations
given $R$ distinguishes $B^m_R(x)$ from random strings of length $m$ with advantage
$\varepsilon$ where $\varepsilon$ is given. Suppose further we have a linear error correcting code, with
$n$ information bits, $N$ message bits that is able to correct $e$ errors in time $T_C$.
Then setting $k = \max(m, \log(\varepsilon^{-2}(2N+1)/e))$ we can in time

$(1 + o(1))2^{m+k}(2m + k + 1 + T + T_C)(N + 1)$

produce a list of $2^{k+m}(N + 1)$ numbers such that the probability that $x$ appears
in this list is at least 1/2.

Proof. We apply Lemma 5 with the given value of $k$ and $\{v_i\}_{i=1}^N$ given by the
row vectors of the generator matrix of the error correcting code. Running the
decoding algorithm on each obtained “codeword” gives a list as claimed. □

Similar to Theorem 2, this translates to the quality of the inverter. We only state 
the resulting algorithm in existential form using O-notation. 

Theorem 5. Suppose we have a linear error correcting code with $n$ information
bits, $O(n)$ message bits that is able to correct $\Omega(n)$ errors in time $T_C$ and that
$G = BMGL^f_{n,m,L}$ is based on an $n$-bit function $f$, computable by $E$ operations,
and that $G$ produces $L$ bits in time $S$. If $G$ can be $(L, T, \delta)$-distinguished then,
with $\delta' = \delta m/L$, there is an $i \le L/m = \lambda$ and $0 \le j \le 2\log\delta'^{-1}$ such that for
$k = \max(m, O(1) + 2\log\delta'^{-1} - j)$, $f$ can be $(T', \Omega(2^{-j/2}(j+1)^{-2}), i)$-inverted
where $T'$ equals

$O(2^{k+m}(k + m + S + T + E + T_C)n)$.

In particular, this implies that the asymptotic time-success ratio decreases by a 
factor n for the parameters discussed after Theorem 2. 



3.2 Applying the GGM Construction

As shown, the BMGL generator can produce any number of output bits. We 
here investigate an alternative way, inspired by a construction of pseudo random 
functions due to Goldreich, Goldwasser, and Micali, [3]. It has the advantage 
that we iterate / fewer times and hence the assumption needed for security is 
weaker. 

The construction can be based on any PRG, $G : \{0,1\}^n \to \{0,1\}^{2n}$, though
we for concreteness think of $G = G(x, R) = BMGL^f_{n,m,2n}(x, R)$ for some $f$.

For simplicity of notation, we shall exclude $R$ from it, keeping in mind that
probabilities should be taken also over the choice of $R$. First, let us assume that
we know in advance how many output bits are desired. We apply [3] to obtain
$2^d n$ output bits (where $d$ is given) from $n(m + 1)$ bits.

Definition 4. Fix $n, d \in \mathbb{N}$. Let $G(x)$ be a generator, stretching $n$ bits to $2n$
bits, and let $G_0(x)$ ($G_1(x)$) be the first (last) $n$ bits of $G(x)$. For $x \in \{0,1\}^n$,
$s \in \{0,1\}^d$ put $g_x(s) = G_{s_d}(G_{s_{d-1}}(\cdots G_{s_2}(G_{s_1}(x)) \cdots))$, and define
$GGM^G_{d,n} : \{0,1\}^n \to \{0,1\}^{2^d n}$ by

$GGM^G_{d,n}(x) = g_x(00 \ldots 0), g_x(00 \ldots 1), \ldots, g_x(11 \ldots 1)$

(the concatenation of $g_x$ applied to all $d$-bit inputs).

The construction can be pictured as a full binary tree $T = (V, E)$ of depth
$d$. Associate $v \in V$ with its breadth-first order number; the root is 1 and the
children of $v$ are $2v, 2v + 1$. Given $x$, the root is first labeled by $\ell(1) = x$.
For a non-leaf $v$ labeled $\ell(v) = y \in \{0,1\}^n$, label its children by $\ell(2v) =
G_0(y)$, $\ell(2v + 1) = G_1(y)$, respectively. The output of $GGM^G_{d,n}$ is simply the
concatenation of all the “leaves” of the tree.
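The tree labeling can be sketched level by level. The code below is an illustration under assumptions: `toy_G` splits a SHA-256 digest into two 16-byte halves and merely stands in for the length-doubling generator $G$ (the paper uses BMGL here), and the names `ggm` and `g_path` are ours.

```python
import hashlib


def toy_G(y: bytes):
    """Stand-in for G: returns (G_0(y), G_1(y)) as two 16-byte strings."""
    digest = hashlib.sha256(y).digest()
    return digest[:16], digest[16:]


def ggm(G, x, d: int):
    """Return the 2^d leaf labels of the depth-d tree rooted at x, left to right."""
    level = [x]
    for _ in range(d):
        nxt = []
        for y in level:
            left, right = G(y)       # labels of the children 2v and 2v + 1
            nxt.extend((left, right))
        level = nxt
    return level


def g_path(G, x, s):
    """g_x(s): apply G_{s_1} first, then G_{s_2}, ..., finally G_{s_d}."""
    for bit in s:
        x = G(x)[bit]
    return x
```

The leaf at position $s$ (read as a $d$-bit number with $s_1$ most significant) equals `g_path(G, x, s)`, so the list returned by `ggm` is exactly the concatenation in Definition 4.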

Notice an advantage of the above method in the case that $G = BMGL^f_{n,m,2n}$.
To produce $L = 2^d n$ bits, each application of $G$ iterates $f$ $2n/m$ times instead
of $2^d n/m$, which, in light of Theorem 1, retains more of the one-wayness of $f$.

Lemma 6. Suppose that $D_1$ is a $(2^d n, T, \delta)$-distinguisher for $GGM^G_{d,n}(x)$ where
$G$ can be computed in time $S$. Then, there is an integer $i < 2^d$ and algorithm $D^{(i)}$
that is a $(2n, T + 2^d S, 2^{-d}\delta)$-distinguisher for $G$.

$D^{(i)}$ depends on $i$, and a value of $i$ achieving advantage $\delta_i \ge 2^{-(d+1)}\delta$ can be
found with probability at least 1/2 in time $c_1 2^{2d}\delta^{-2}(T + 2^d S)$ where $c_1$ is the
constant given by (5).







J. Hastad and M. Naslund 



Proof (sketch). Consider the binary tree $T$, describing a computation of $GGM^G_{d,n}$
as above. The tree has depth $d$, $2^d - 1$ internal vertices and $2^d$ leaves. We construct
hybrid distributions $H^0, \ldots, H^{2^d-1}$ on the vertex-labels of such trees. Again,
associate each $v \in V$ with its breadth-first order number. Then, $H^i$ is defined by
a simulation algorithm, $GGM^i(x)$, which on input $x$, assigns labels as follows.
Assign the root, $v = 1$, the label $x$. For $v \in V$, $v = 1, 2, \ldots, i$, label $v$'s children
by letting $\ell(2v), \ell(2v + 1)$ be independent, random $n$-bit strings. Then, for
$v = i + 1, \ldots, 2^d - 1$: $\ell(2v) = G_0(\ell(v))$, $\ell(2v + 1) = G_1(\ell(v))$. Finally return
the labels of the leaves in $T$.

Observe that $H^{2^d-1}$ gives the uniform distribution over the node labels (in
particular, over the leaves) and $H^0$ labels the vertices exactly as $GGM^G_{d,n}$ does on
a random seed $x$. Since $D_1$ distinguishes $GGM^G_{d,n}(x)$ from random $2^d n$-bit strings
with advantage $\delta$, for some $i < 2^d$, it must be the case that $D_1$ distinguishes
$H^i$, $H^{i+1}$ with advantage at least $2^{-d}\delta$.

Finding $i$ is now done in complete analogy with Lemma 2, letting the function
$F$ there correspond to the node labeling.

We now construct $D^{(i)}$: when $D^{(i)}$ gets input $\gamma \in \{0,1\}^{2n}$, it selects random $x$
and feeds $D_1$ a value $y$, computed as $GGM^{i+1}(x)$ with the following exception:
$i + 1$ is not assigned any label^, and the children of $i + 1$ are assigned the left/right
$n$-bit half of $\gamma$ respectively. It is not too hard to see that if $\gamma$ is random, we give $D_1$
a value according to exactly the same distribution as $H^{i+1}$, whereas if $\gamma = G(x')$,
$D_1$ is given a value from the same distribution as $GGM^i(x)$, i.e. $H^i$. Thus, by
returning $D_1$'s answer to $y$, $D^{(i)}$'s advantage equals that of $D_1$. □



Unknown Output Length. If the length of the “stream” is unknown beforehand,
we let the basic generator $G$ expand $n$ bits to $3n$ bits. Apply the
tree-construction as above, labeling left/right children by the first, respectively
second $n$-bit substring of $G$'s output. The remaining $n$ bits are used to produce
an output at each vertex as we traverse the tree breadth-first. The analysis
is analogous. To save memory, the traversal can be implemented in iterative
depth-first fashion.
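A breadth-first version of this streaming variant can be sketched with a queue. The names below are hypothetical, and `toy_G3` (a SHA-384 digest cut into three 16-byte parts) only stands in for a generator expanding $n$ bits to $3n$ bits. The queue grows with the number of outputs, which is the memory cost that the iterative depth-first traversal mentioned above would avoid.

```python
import hashlib
from collections import deque
from itertools import islice


def toy_G3(y: bytes):
    """Stand-in for G: n bits -> 3n bits, returned as (left, right, output)."""
    digest = hashlib.sha384(y).digest()      # 48 bytes = 3 * 16
    return digest[:16], digest[16:32], digest[32:]


def ggm_stream(G3, seed):
    """Yield one n-bit output per vertex, visiting the tree breadth-first."""
    queue = deque([seed])
    while True:
        left, right, out = G3(queue.popleft())
        yield out
        queue.append(left)     # label of child 2v
        queue.append(right)    # label of child 2v + 1


if __name__ == "__main__":
    for block in islice(ggm_stream(toy_G3, b"\x00" * 16), 3):
        print(block.hex())
```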

3.3 Concrete Examples 

What does all this say? Suppose that we base the construction on Rijndael$(x) =$
Rijndael$_x(p)$ (for a fixed plaintext $p$) and that we want to generate $L = 2^{30}$ bits,
applying our construction with $m = 32$ (32 bits per iteration). One choice of
parameters gives the following corollary.

Corollary 1. Consider $G = BMGL^{\mathrm{Rijndael}}_{256,32,2^{30}}$ (using key/block length 256/256) and
where Rijndael is computable by $E$ operations, and assume that $G$ runs in time
$S$. If $G$ can be $(2^{30}, T, 2^{-32})$-distinguished, then there is $i \le 2^{25}$, and $0 \le j \le 114$
such that setting $k = \max(32, 123 - j)$, Rijndael can be $(T', d_j, i)$-inverted ($d_j$
given by (7) and (8)) for $T' = 2^{41+k}(65 + k + T + S + E)$.

^ As the labels of non-leaves are never exposed, one can conceptually think of the
process as labeling $i + 1$ afterwards.
Similarly, setting $G' = BMGL^{\mathrm{Rijndael}}_{256,32,512}$ then using $GGM^{G'}_{22,256}$ (to
generate the same length outputs), the result holds for some $i \le 16$.

This is simply substituting the parameters and noting that the $o(1)$ in Theorem 2
comes from disregarding the time to construct the matrices described in Lemma 3
and for the current choice of parameters using $(1 + o(1))(n + 1) \le 2^9$ is an
overestimate.
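The substitution can be retraced with a few lines of arithmetic. The sketch below only recomputes quantities that follow directly from $n = 256$, $m = 32$, $L = 2^{30}$ and $\delta = 2^{-32}$; the variable names are ours, and the value of $k$ is deliberately not recomputed here.

```python
import math

n, m, L = 256, 32, 2 ** 30
log2_delta = -32                                  # delta = 2^-32

lam = L // m                                      # number of iterations of f
log2_delta_prime = log2_delta - math.log2(lam)    # delta' = delta * m / L
j_max = -2 * log2_delta_prime                     # upper end of the range of j

n_factor_exp = math.ceil(math.log2(n + 1))        # (n + 1) <= 2^9
t_exp_offset = m + n_factor_exp                   # exponent 41 in 2^(41 + k)
additive = 2 * m + 1                              # the constant 65 in T'

iters_per_G_prime = 512 // m                      # GGM variant: f-iterations per G'
ggm_depth = int(math.log2(L // n))                # d with 2^d * n = L

print(lam == 2 ** 25, log2_delta_prime, j_max, t_exp_offset, additive)
```

The last two values correspond to the bound $i \le 16$ and to the tree depth needed to reach $2^{30}$ output bits with 256-bit labels.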

Assuming we have a simple statistical test such as Diehard tests, [11], or
those by Knuth, [7], it is reasonable to assume that $65 + k + T + E \le S$. From
the first part of the corollary, then, the essential part of computing the generator
comes from the $2^{25}$ computations of Rijndael and we end up with a time for the
inverter equivalent to at most $2^{67+k}$ Rijndael computations. The maximum of
is obtained for j = 5 in which case it equals 2^^"^ • 7.5 < 2^^^. We 
conclude that in this case we get a time-success ratio that is equivalent to at 
most 2^®^ computations of Rijndael and since i < 2^®, Rijndael would not be 
2“^^-secure. 

Alternatively, bootstrapping the BMGL construction by the GGM method, 
we conclude from the second part of the corollary that such a test would mean 
that Rijndael cannot be even 2“®^-secure. Thus, though somewhat more cum- 
bersome to implement, the GGM method is more security preserving. 

If we want to find the values of $i$ and $j$ efficiently the ratio increases by a
factor $2^6$. Note that for the case with small $j$ the time needed to find $i$ and $j$ is
much smaller than the running time of the inverter.

4 Discussion 

4.1 Choice of f 

To implement the generator in practice, we suggest basing the one-way function
on Rijndael. First of all it is widely believed to be secure and has been shown to
be very efficient. (A trial implementation of BMGL gives speeds in the range
2-10 Mb/s on a standard PC, depending on choice of $m$.) Secondly, as our
construction requires that the block size of the cipher is equal to the key size,
the fact that Rijndael supports both 128 and 256-bit block size is advantageous,
as it makes it possible to vary the security parameter (key size).

Again note that the one-way function we suggest using is obtained by fixing a
message, $p$, letting the input be the encryption key, $x$, and the output the cipher-text.
To obtain a permutation and at the same time increased speed, it might appear
to be better to have the mapping from clear-text to crypto-text and iterate
$f_x(p)$ rather than $f_p(x)$. The problem is that this is by definition not a one-way
function: anybody that can compute it can invert it. A possibility is also to use
an efficient cryptographic hash function as $f$.

4.2 Decreasing Seed Size 

The impact on security of varying m is clearly visible in the above theorems. 
Though increasing speed, a practical problem with a large m is the seed size; nm 

* Common “practical” tests are almost always much faster than the generator tested. 
bits specifies a matrix $R$. First note though, that the security does not depend
on the fact that $R$ is secret; only that it is random.

It is possible to decrease the number of bits to only $n$ by, instead of binary
matrix multiplication, performing a multiplication by a random element in the
finite field $\mathbb{F}_{2^n}$, and selecting any fixed set of $m$ bits of this, see [12]. A drawback
of this construction is that instead of the direct reduction from a distinguisher
for $B^m_R(x)$ to a predictor for $\langle v_i, x \rangle_2$ (Lemma 5), the restricted sample-space
of elements makes us need to use the so-called Computational XOR-Lemma,
[14]. Unfortunately, this reduces the initial $\delta$-advantage of the distinguisher to a
$2^{-m}\delta$-advantage for the predictor for $\langle v_i, x \rangle_2$, and when the smoke clears we lose
a factor $2^m$ in the running time of the inverter.

An alternative, suffering the same security drawback, is to pick $R$ as a random
Toeplitz matrix, specified by $n + m - 1$ bits, [4].
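For illustration, an $m \times n$ Toeplitz matrix can be expanded from its $n + m - 1$ defining bits as follows. The indexing convention (entry $(i, j)$ depends only on $i - j$) and the function name are our assumptions; rows are returned as integers with column 0 as the most significant bit.

```python
def toeplitz_rows(bits, m: int, n: int):
    """Expand n + m - 1 bits into the m rows of a Toeplitz matrix.

    Entry (i, j) equals bits[i - j + n - 1], so it is constant along diagonals.
    """
    if len(bits) != n + m - 1:
        raise ValueError("need exactly n + m - 1 bits")
    rows = []
    for i in range(m):
        row = 0
        for j in range(n):
            row = (row << 1) | bits[i - j + n - 1]
        rows.append(row)
    return rows


def entry(rows, i: int, j: int, n: int) -> int:
    """Read entry (i, j) back from the integer row encoding."""
    return (rows[i] >> (n - 1 - j)) & 1
```

The resulting rows can be used wherever a list of row integers for $R$ is expected, at a seed cost of $n + m - 1$ bits instead of $nm$.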



5 Summary and Conclusions 

We have given a careful security analysis of a very natural pseudorandom gen- 
erator. Apart from optimizing known constructions and analysis we have intro- 
duced a new analysis method when several bits are output for each iteration of 
the one-way function. 

Another common method to derive PRGs from a block cipher is to run it in
counter mode. Though admittedly simpler, the proof of such constructions relies
on the assumption that the core, $f$, is a pseudo-random function. The strictly
weaker type of security assumption we have proposed (a function being one-way
on its iterates), although it has been proposed before by Levin, is for the first
time made in a quantitative sense and we believe that this concept will be useful
for future study of one-way functions.

Acknowledgment. We thank Bernd Meyer, Gustav Hast, and anonymous re- 
viewers of different versions of this paper for helpful comments. 
A Additional Proofs

Proof (of Lemma 2). Let $\delta_i$ be $D$'s advantage on $H^i$, $H^{i+1}$. The problem is that
even though $\lambda^{-1}\sum_i \delta_i \ge \delta/\lambda = \delta'$, there is a large number of possibilities for the
individual $\delta_i$. Basically, these possibilities all lie between the two extreme cases:
(1) There are a few large $\delta_i$, while most are close to 0. (2) All $\delta_i$ are about the
same, but none is very large. Suppose we try random $i$'s. In the first case, we
may need to try many $i$, but it can be done with a rather low sampling accuracy.
In the second case, we expect to find a fairly good $i$ rather quickly, but we need a
higher precision in the sampling. The idea is therefore to divide the sampling into
a number of stages, $\{S(j)\}_{j \ge 0}$, each with different sampling accuracy. Stage $S(j)$
chooses some random $i$-values and samples $D$ on inputs generated from $H^i$, $H^{i+1}$.
As soon as a sufficiently “good” $i$ is detected, the procedure terminates. Below
we quantify the needed accuracy and the criterion for selecting the good $i$.

For $j \in \{0, 1, \ldots, -2\log\delta'\}$ let $a_j$ be the fraction of $i$ such that $\delta_i \ge 2^{(j-1)/2}\delta'$.
By the assumption of the lemma we have

$a_0 + \sum_{j=1}^{\infty} a_j(2^{(j-1)/2} - 2^{(j-2)/2}) \ge 1 - 2^{-1/2}$.   (3)

Define $b_0$ to be $\lceil 4(1 - 2^{-1/2})^{-1} \rceil$ and

$b_j = \lceil 4(1 - 2^{-1/2})^{-1}(2^{(j-1)/2} - 2^{(j-2)/2}) \rceil = \lceil 2^{(j+3)/2} \rceil$,

for $j > 0$. The $b_j$-values, together with a parameter $T_j$ now define the sampling
accuracy. Given these values, we determine $i$ as follows.

In stage $S(j)$, $j = -2\log\delta', -2\log\delta' - 1, \ldots, 0$ choose $b_j$ different random
values of $i$ and sample $H^i$ and $H^{i+1}$ each $T_j\delta'^{-2}$ times and run $D$ on each of
the samples. If the difference in the number of 1-outputs is at least
$(2^{(j-1)/2}T_j - \sqrt{T_j/2})\delta'^{-1}$ choose this $i$ and halt. If no $i$ is ever chosen halt with
failure. We need to analyze the procedure and determine $T_j$.
Suppose that at stage $j$ an $i$ is picked such that $\delta_i \ge 2^{(j-1)/2}\delta'$. We claim
that the algorithm halts with this $i$ as output with probability at least 1/2. To
establish this first consider the following fact, the proof of which we leave to the
reader.

Fact. Let $X$ be a random variable with mean $\mu$ and standard deviation $\sigma$. Then
we have

$\Pr[X < \mu - \sigma] \le 1/2$.

From this, the above claim now follows since the expected difference in the
number of 1-outputs when $\delta_i \ge 2^{(j-1)/2}\delta'$ is at least $2^{(j-1)/2}T_j\delta'^{-1}$ and the
standard deviation (being the sum of $T_j\delta'^{-2}$ variables each being the difference of
two 0/1-valued variables) is at most $\delta'^{-1}\sqrt{T_j/2}$. This implies that the probability
that the algorithm halts for an individual iteration during stage $j$ is at least $a_j/2$.
The probability that the algorithm will fail to output any number is thus bounded
by

$\prod_j (1 - a_j/2)^{b_j} \le e^{-\sum_j a_j b_j/2} \le e^{-2}$,

where the last inequality follows from (3) and the definition of $b_j$.

We must bound the probability that the algorithm terminates with an $i$ such
that $\delta_i \le \delta'/2$. Let us analyze the probability that such an $i$ would be output
during an individual run of stage $j$ provided that it is chosen as a candidate.
The expected difference of the number of 1-outputs in the two experiments is
at most $T_j\delta'^{-1}/2$ and we have to estimate the probability that it is at least
$(2^{(j-1)/2}T_j - \sqrt{T_j/2})\delta'^{-1}$. This is, provided

$$T_j(2^{(j-1)/2} - 1/2) - \sqrt{T_j/2} > 0,$$



by a simple invocation of Chernoff bounds, at most 



e 



(Tj / 2)2 



(4) 



Let us call this probability $p_j$. The overall probability of ever outputting an $i$
with $\delta_i \le \delta'/2$ is bounded by

$$\sum_j b_j p_j.$$

We now define $T_j$ to be the smallest number satisfying (4) such that
$p_j \le 2^{-(j+3)}b_j^{-1}$ and such that $T_j\delta'^{-2}$ is an integer. We get that with this choice
the probability of outputting an $i$ with $\delta_i \le \delta'/2$ is at most 1/4 and hence the
probability that we do get a good output is at least $(1 - e^{-2})\frac{3}{4} > .64$. The total
number of samples of the algorithm is bounded by $c_1\delta'^{-2}$, where

$$c_1 = 2\sum_j b_j T_j. \qquad (5)$$

Note that this sum converges since $T_j \in O(j2^{-j})$ and $b_j \in O(2^{j/2})$. In fact,
it can numerically be calculated to be bounded by 5300. Moreover, the sum is
completely dominated by the first term which is over 4600, and the sum of all
but the first three terms is bounded by 250. Thus, a more careful analysis of what
to do for small $j$ could lead to considerable improvements in this constant. □

Before we continue let us make some needed definitions. Let bin($i$) be the
map that sends the integer $i$, $0 \le i < 2^m$, to its binary representation as an
$m$-bit string. In the sequel, we perform some computations in $\mathbb{F}_{2^k}$, the finite
field of $2^k$ elements, represented as $\mathbb{Z}_2[t]/(q(t))$ where $q(t)$ is a polynomial of
degree $k$, irreducible over $\mathbb{Z}_2$. We assume that such $q$ is available to us. If not,
it can be found in expected time which is negligible compared to the
other running times considered. Viewing $\mathbb{F}_{2^k}$ as a vector space over $\mathbb{F}_2$, for
any $\gamma = \sum_{i=0}^{k-1} \gamma_i t^i \in \mathbb{F}_{2^k}$, we let in the natural way bin($\gamma$) denote the vector
$(\gamma_0, \ldots, \gamma_{k-1})$ corresponding to $\gamma$'s representation over the standard polynomial
basis. Note also that bin($\gamma$) can be interpreted as a subset of $[0..k-1]$ in the
obvious way.

Proof (of Lemma 3). First choose randomly and independently $m$ $n$-bit strings,
$a_0, \ldots, a_{m-1}$, and $k$ strings $b_0, \ldots, b_{k-1}$, each also of length $n$. The $j$th matrix,
$R^j$, is now defined by $\{a_i\}$, $\{b_i\}$, and an element $\alpha_j \in \mathbb{F}_{2^k}$ as follows. Its $i$th row,
$R^j_i$, $0 \le i < m$, is defined by

$$R^j_i = a_i \oplus \Bigl(\bigoplus_{l \in \mathrm{bin}(\alpha_j \cdot t^i)} b_l\Bigr),$$

where $\alpha_j$ is the lexicographically $j$th element of $\mathbb{F}_{2^k}$ (i.e. the lexicographically
$j$th binary string), the multiplication, $\alpha_j \cdot t^i$, is carried out in $\mathbb{F}_{2^k}$, and $\oplus$ is
bitwise addition mod 2.
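The row definition can be sketched in a few lines of code. This is an illustration under stated assumptions, not the authors' implementation: $n$-bit strings are stored as Python integers, $\alpha_j$ is taken to be the field element whose coefficient vector is the binary expansion of $j$, and the parameters `K`, `M`, `N_BITS` and the polynomial `Q` ($t^3 + t + 1$) are arbitrary small choices. The names `gf_mul` and `build_matrix` are hypothetical.

```python
import random

def gf_mul(a, b, q, k):
    """Multiply a and b in F_2[t]/(q(t)); q includes the leading t^k term."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> k) & 1:      # reduce as soon as the degree reaches k
            a ^= q
    return r

def build_matrix(alpha, a_rows, b_rows, q, k):
    """Rows R_i = a_i xor (xor of b_l over l in bin(alpha * t^i))."""
    rows = []
    t_pow = 1                 # t^0
    for a_i in a_rows:
        gamma = gf_mul(alpha, t_pow, q, k)
        row = a_i
        for l in range(k):
            if (gamma >> l) & 1:
                row ^= b_rows[l]
        rows.append(row)
        t_pow = gf_mul(t_pow, 2, q, k)   # next power of t
    return rows

K, M, N_BITS = 3, 4, 8
Q = 0b1011                    # t^3 + t + 1, irreducible over F_2
rng = random.Random(1)
a_rows = [rng.getrandbits(N_BITS) for _ in range(M)]
b_rows = [rng.getrandbits(N_BITS) for _ in range(K)]
matrices = [build_matrix(alpha, a_rows, b_rows, Q, K) for alpha in range(1 << K)]
```

Only $m + k$ random strings are drawn, yet $2^k$ matrices are produced, which is the point of the construction.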

Clearly the matrices are uniformly distributed, since the $a_i$ are chosen at
random. To show pairwise independence it suffices to show that an exclusive-or
of any subset of elements from any two matrices is unbiased. Since the columns
are independent, it is enough to show that the exclusive-or of any non-empty
set of rows from two distinct matrices $R^{j_1}$ and $R^{j_2}$ is unbiased. Take such a set
of rows, $S_1 \subset R^{j_1}$ and $S_2 \subset R^{j_2}$. We may actually assume that $S_1 = S_2 = S$,
say, since otherwise the $a$-vectors make the result uniformly distributed. In this
case the xor can be written as

$$\bigoplus_{i \in S}\ \bigoplus_{l \in \mathrm{bin}((\alpha_{j_1} \oplus \alpha_{j_2}) \cdot t^i)} b_l,$$

but this is the same as

$$\bigoplus_{l \in \mathrm{bin}((\alpha_{j_1} \oplus \alpha_{j_2}) \cdot (\sum_{i \in S} t^i))} b_l,$$

which is unbiased if, and only if, $\mathrm{bin}((\alpha_{j_1} \oplus \alpha_{j_2}) \cdot (\sum_{i \in S} t^i)) \ne 0$. However,
$\alpha_{j_1} \ne \alpha_{j_2}$ and $\sum_{i \in S} t^i \ne 0$, so we have two nonzero
elements and hence their product is nonzero.

Notice that if we know $\sum_l a_{il}x_l$ and $\sum_l b_{il}x_l$ mod 2 for all $a_i$, $b_i$ (a total
of $m + k$ bits), then by the linearity of the above construction, we also know
the matrix-vector products $R^j x$ for all $j$. To calculate all the matrices we first
compute the reduction of $t^i$ for all $i = k+1, \ldots, 2k$ in $GF[2^k]$. Using an iterative
procedure this can be done with $3k$ operations on $k$-bit words, and since we only
care about $k \le n$ these can be done in unit time. Now generate the vectors $a$ and
$b$ in time $m + k$ operations. Then we compute for each $i = 0, \ldots, 2k$

using $k^2$ operations. By using a Gray-code construction each row of a matrix can
now be generated with two operations and thus the total number of operations
is $2m2^k + k^2 + m + 4k$. □

Proof (of Theorem 2). First we apply Lemma 1 to see that there is an i for which 
we have an algorithm that when given fb^\x) runs in time S{L) + T{L) and 
distinguishes Bf({f^'^~^\x)) from random bits with advantage at least 6", where 
S" is S' /2 or 6' depending on whether we want to find i efficiently, or only show 
existence (i.e. uniform/non-uniform algorithm). Since 6" is an average over all 
X we need to do some work before we can apply Theorem 3. 

For each $x$ we have an advantage $\delta_x$. Let $a_j$ be the fraction of $x$ with
$\delta_x \ge 2^{(j-1)/2}\delta''$. Since the expected value of $\delta_x$ is $\delta''$ we have

$$a_0 + \sum_{j=1}^{\infty} a_j\left(2^{(j-1)/2} - 2^{(j-2)/2}\right) \ge 1 - 2^{-1/2}.$$



Now define

$$d_0 = \tfrac{1}{2}(1 - 2^{-1/2}) \qquad (7)$$

and

$$d_j = \left(2j(j+1)2^{(j-1)/2}\right)^{-1} \qquad (8)$$

for $j \ge 1$. Since

$$d_0 + \sum_{j=1}^{\infty} d_j\left(2^{(j-1)/2} - 2^{(j-2)/2}\right) = 1 - 2^{-1/2}, \qquad (9)$$

we must have $a_j \ge d_j$ for some $j$ and this is our choice for $j$ in the existential
part. We now apply Theorem 3 with $\epsilon = 2^{(j-1)/2}\delta''$. To eliminate the list we
apply $f$ to each element in it to see if it is a correct pre-image, in which case it is
output. Whenever $\delta_x \ge \epsilon$ we have a probability 1/2 of having $f^{(i-1)}(x)$ in
the list and hence the probability of being successful for a random $x$ is at least
$d_j/2$.

To get a uniform algorithm, we need to sample to find a suitable value of j. 
Consider the following procedure for parameters d and Tj to be determined. 

For j = — 21ogd", — 21ogi5" — 1, ... ,0 choose d{j + i)d~^ different random 
values of x and run for each x, TjS"~^ each on the two distributions given by 
choosing the m extra bits as Bf({f^'~^\x)) or as random bits. If the difference 
in the number of 1-outputs for the two distributions is at least — 

ySTjj2)6"~^ for at least d{j -I- 3)/4 different values, choose this j and apply the 
algorithm of Theorem 3 with e = 2//-2)/^(5" = . 
First we analyze the probability that the algorithm outputs j if it ever gets 
to a stage where aj > dj. For each x chosen, the probability that it will satisfy 
dx > and yield the desired difference is by the choice of j and Fact A, 

at least aj/2 > dj/2. Thus, for sufficiently large d, with probability at least 
1 — this desirable distance will be detected d{j + 3)/4 times and j will 

be output. Hence, except with this probability the algorithm will produce some 
output and we have to analyze the probability that a worse j is output at an 
earlier stage. 

We claim that unless aj-i > dj/8, the probability of j being output is 
2-0+3). Suppose that aj-i < dj/8 and consider an individual execution in stage 
j. For a suitable choice of Tj we will prove that the probability that we observe 
a difference greater than — sjTj]2d)d''~^ is bounded by dj/6. This is 

sufficient, for large enough d, to establish the claim. 

By assumption dx < except with probability dj/8 and thus we 

need to prove that given that this inequality is true, the probability to get the 
desired difference is at most dj/24. By assumption the expected value of the 
observed difference is 2^^~‘^'>/^Tjd"~^ , and by applying Chernoff bounds it is 
hence sufficient to choose Tj large enough so that 



This can be done with Tj = 0{{j + 3)2 ^). The expected number of samples 
computed, given that jo is the largest value such that aj„ > dj„, is at most 

OO jo — 1 

d{j + 3)d-%d"-^ + 2-(^«+3) d{j + 3)d-^T^d"-^, 

j=io j=o 

which is O{jo2~^°^^d'~^). 

In the case where we efficiently find $i$ and $j$, the final value of $\epsilon$ for which we
call upon Theorem 3 is a factor $2^{-3/2}$ smaller than in the existential case, and
hence the running time is increased by a factor $8 + o(1)$, where the
$o(1)$ comes from the increase in the additive term $k$. By the above argument the
guarantee for the fraction of the inputs for which the procedure has probability
at least 1/2 of finding the inverse image is at least 1/8 of that in the existential
case. □
1 Introduction 

Different types of ciphers use Boolean functions. For example, LFSR based stream
ciphers use Boolean functions as a nonlinear combiner or a nonlinear filter, block
ciphers use Boolean functions in substitution boxes, and so on. Boolean functions
used in ciphers must satisfy some specific properties to resist different attacks.
One of the most important desired properties of Boolean functions in LFSR
based stream ciphers is correlation immunity, introduced by Siegenthaler [13].
Other important properties are nonlinearity, algebraic degree and so on. For
Boolean functions used in block ciphers the most important properties are
nonlinearity and differential (or autocorrelation) characteristics (propagation degree,
avalanche criterion, the absolute indicator and so on) based on the
autocorrelation coefficients of Boolean functions. Note that in recent research differential
characteristics are considered important for stream ciphers too.

Correlation immunity (or resiliency) is a property important in cryptography
not only for stream ciphers. It matters whenever we want
the knowledge of some specified number of input bits to give no (statistical)
information about the output bit. In this respect such functions are considered
in [6], [3] and other works.

Many works (see for example [5]) demonstrate that correlation immunity and
autocorrelation characteristics are in strong contradiction. Some of the results in our
paper confirm this. Nevertheless, it appears that the autocorrelation coefficients of a
Boolean function are a powerful tool for the investigation of correlation immunity
and other properties even without a direct relation to differential characteristics.
The results of our paper demonstrate this.

In Section 2 we give preliminary concepts and notions. In Section 3 we prove
a new lower bound $\Delta_f \ge \left(\frac{2m-n+3}{n+1}\right)2^n$ for the absolute indicator of resilient
functions that improves significantly (for $m > (n-3)/2$) the bound of Zheng and
Zhang [18] on this value. In Section 4 we prove that the number of nonlinear
variables in an $n$-variable $(n-k)$-resilient Boolean function does not exceed $(k-1)2^{k-2}$.
This result supersedes the previous record $n \le (k-1)4^{k-2}$ of Tarannikov and
Kirienko [16]. As a consequence we give a sufficient condition on $m$ and $n$ that
the absolute indicator of an $n$-variable $m$-resilient function is equal to the maximum
possible value $2^n$. In Section 5 we characterize all possible values of resiliency
orders for quadratic functions, i.e. functions with algebraic degree 2 in each
variable. In Section 6 we give a complete description of quadratic $n$-variable
$m$-resilient Boolean functions that achieve the bound $m \le \frac{n}{2} - 1$. In Section 7
we establish a new necessary condition that connects $m$, $n$ and the weight of an
$n$-variable unbalanced nonconstant $m$th order correlation immune function and
prove that such functions do not exist for $m > 0.75n - 1.25$. For high orders of $m$
this surprising fact supersedes the well-known Bierbrauer-Friedman bound [8],
[1] and was not formulated before even as a conjecture. In Section 8 we prove
that for m > \n+ \ log2 n + ^ log2 (f e®/®) — 1, n > 12, the nonlinearity of an 
unbalanced mth order correlation immune function of n variables does not ex- 
ceed 2”-i -2™+i, and for m > f log2n-klog2 Q + ^) + 5 log2 (f e®/®) -2, 
n > 24, this nonlinearity does not exceed 2"“^ — 2"^+^. These facts improve sig- 
nificantly correspondent results of Zheng and Zhang [18] and demonstrate that 
for higher orders of resiliency the maximum possible nonlinearity for balanced 
functions is greater than for unbalanced. 

Throughout the paper we actively apply autocorrelation and Walsh coefficients to
the investigation of correlation immune and resilient Boolean functions. Our new
results demonstrate the power of this approach.
2 Preliminary Concepts and Notions 

We consider $F_2^n$, the vector space of $n$-tuples of elements from $F_2$. An $n$-variable
Boolean function is a map from $F_2^n$ into $F_2$. The weight of a vector $x$ is the
number of ones in $x$ and is denoted by $|x|$. We say that the vector $x$ precedes
the vector $y$ and denote it as $x \preceq y$ if $x_i \le y_i$ for each $i = 1, 2, \ldots, n$. The scalar
product of vectors $x$ and $u$ is defined as $\langle x, u \rangle = \sum_{i=1}^{n} x_i u_i$.

The weight $wt(f)$ of a function $f$ on $F_2^n$ is the number of vectors $x$ on $F_2^n$
such that $f(x) = 1$. A function $f$ is said to be balanced if $wt(f) = wt(f \oplus 1) = 2^{n-1}$.
A subfunction of the Boolean function $f$ is a function $f'$ obtained by
substituting some constants for some variables in $f$.

It is well known that a function $f$ on $F_2^n$ can be uniquely represented by
a polynomial on $F_2$ whose degree in each variable in each term is at most 1.
Namely,

$$f(x_1, \ldots, x_n) = \bigoplus_{(a_1, \ldots, a_n) \in F_2^n} g(a_1, \ldots, a_n)\, x_1^{a_1} \cdots x_n^{a_n}$$

where $g$ is also a function on $F_2^n$. This polynomial representation of $f$ is called
the algebraic normal form (briefly, ANF) of the function and each $x_1^{a_1} \cdots x_n^{a_n}$
is called a term in the ANF of $f$. The algebraic degree of $f$, denoted by $\deg(f)$,
is defined as the number of variables in the longest term of $f$. The algebraic
degree of variable $x_i$ in $f$, denoted by $\deg(f, x_i)$, is the number of variables in
the longest term of $f$ that contains $x_i$. If $\deg(f, x_i) = 1$, we say that $f$ depends
on $x_i$ linearly. If $\deg(f, x_i) \ne 1$, we say that $f$ depends on $x_i$ nonlinearly. A
term of length 1 is called a linear term. If $\deg(f) \le 1$ then $f$ is called an affine
function. If $f$ is an affine function and $f(0) = 0$ then $f$ is called a linear function.

Definition 1. We say that the Boolean function $f$ is quadratic if the algebraic
degree of each variable in $f$ is 2, i.e. if $\deg(f, x_i) = 2$ for each $i = 1, 2, \ldots, n$.

The Hamming distance $d(x_1, x_2)$ between two vectors $x_1$ and $x_2$ is the number
of components where vectors $x_1$ and $x_2$ differ. For two Boolean functions $f_1$ and
$f_2$ on $F_2^n$, we define the distance between $f_1$ and $f_2$ by
$d(f_1, f_2) = \#\{x \in F_2^n \mid f_1(x) \ne f_2(x)\}$. It is easy to see that $d(f_1, f_2) = wt(f_1 \oplus f_2)$. The minimum
distance between $f$ and the set of all affine functions is called the nonlinearity
of $f$ and denoted by $nl(f)$.

Definition 2. The Walsh Transform of a Boolean function $f$ is an integer-valued
function over $F_2^n$ that can be defined as

$$W_f(u) = \sum_{x \in F_2^n} (-1)^{f(x) + \langle u, x \rangle}.$$

Walsh coefficients satisfy Parseval's equation $\sum_{u \in F_2^n} W_f^2(u) = 2^{2n}$.

Lemma 1. Let $f$ be an arbitrary Boolean function on $F_2^n$. Then

$$wt(f) = 2^{n-1} - \frac{1}{2}W_f(0).$$

It is well known that $nl(f) = 2^{n-1} - \frac{1}{2}\max_{u \in F_2^n} |W_f(u)|$.
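These definitions are easy to evaluate by brute force for small $n$. The sketch below assumes a function is given by its truth table `tt`, indexed by the integer whose bit $i-1$ holds $x_i$; this encoding and the example function $x_1x_2 \oplus x_3$ are illustrative choices, not taken from the paper.

```python
def parity(v):
    """Parity of the number of one-bits, i.e. <u, x> mod 2 when v = u & x."""
    return bin(v).count("1") & 1

def walsh(tt, n):
    """All Walsh coefficients W_f(u) = sum_x (-1)^(f(x) + <u,x>)."""
    size = 1 << n
    return [sum((-1) ** (tt[x] ^ parity(u & x)) for x in range(size))
            for u in range(size)]

def weight(tt):
    """Number of inputs on which f equals 1."""
    return sum(tt)

def nonlinearity(tt, n):
    """nl(f) = 2^(n-1) - max|W_f(u)| / 2."""
    return (1 << (n - 1)) - max(abs(w) for w in walsh(tt, n)) // 2

n = 3
# f(x1, x2, x3) = x1*x2 + x3 over F_2
tt = [((x & 1) & ((x >> 1) & 1)) ^ ((x >> 2) & 1) for x in range(1 << n)]
W = walsh(tt, n)
print(W, weight(tt), nonlinearity(tt, n))
```

The quadratic-time transform is used for readability; a fast Walsh-Hadamard transform would give the same coefficients.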

A Boolean function $f$ on $F_2^n$ is said to be correlation immune of order $m$,
with $1 \le m \le n$, if the output of $f$ and any $m$ input variables are statistically
independent. This concept was introduced by Siegenthaler [13]. In an equivalent
non-probabilistic formulation the Boolean function $f$ is called correlation
immune of order $m$ if $wt(f') = wt(f)/2^m$ for any of its subfunctions $f'$ of $n - m$
variables. A balanced $m$th order correlation immune function is called an
$m$-resilient function. In other words the Boolean function $f$ is called $m$-resilient if
$wt(f') = 2^{n-m-1}$ for any of its subfunctions $f'$ of $n - m$ variables. In [9] a
characterization of correlation immune functions by means of Walsh coefficients is
given:

Theorem 1. [9] A Boolean function $f$ on $F_2^n$ is correlation immune of order
$m$ if and only if $W_f(u) = 0$ for all vectors $u \in F_2^n$ such that $1 \le |u| \le m$.

Theorem 2. [12] If $f$ is an $m$th order correlation immune function on $F_2^n$,
$m \le n - 1$, then $W_f(u) \equiv 0 \pmod{2^{m+1}}$. Moreover, if $f$ is $m$-resilient,
$m \le n - 2$, then $W_f(u) \equiv 0 \pmod{2^{m+2}}$.
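The equivalence between the subfunction formulation and the spectral characterization of Theorem 1 can be checked exhaustively for small $n$. The following sketch assumes the truth-table encoding with bit $i-1$ of the index holding $x_i$; the function names are hypothetical.

```python
from itertools import combinations, product

def parity(v):
    return bin(v).count("1") & 1

def is_ci_walsh(tt, n, m):
    """Theorem 1: W_f(u) = 0 for every u with 1 <= |u| <= m."""
    size = 1 << n
    for u in range(1, size):
        if bin(u).count("1") <= m:
            if sum((-1) ** (tt[x] ^ parity(u & x)) for x in range(size)) != 0:
                return False
    return True

def is_ci_subfunctions(tt, n, m):
    """Every subfunction obtained by fixing m variables has weight wt(f) / 2^m."""
    size = 1 << n
    total = sum(tt)
    for fixed in combinations(range(n), m):
        for values in product((0, 1), repeat=m):
            sub_weight = sum(
                tt[x] for x in range(size)
                if all((x >> i) & 1 == v for i, v in zip(fixed, values))
            )
            if sub_weight * (1 << m) != total:
                return False
    return True

def all_truth_tables(n):
    """Every Boolean function of n variables, as a truth-table list."""
    size = 1 << n
    for code in range(1 << size):
        yield [(code >> x) & 1 for x in range(size)]

# Count disagreements between the two formulations over all 3-variable functions.
mismatches = sum(
    is_ci_walsh(tt, 3, m) != is_ci_subfunctions(tt, 3, m)
    for tt in all_truth_tables(3) for m in (1, 2)
)
print(mismatches)
```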

Definition 3. Let $f$ be a Boolean function on $F_2^n$. For each $u \in F_2^n$ the
autocorrelation coefficient of the function $f$ at the vector $u$ is defined as

$$\Delta_f(u) = \sum_{x \in F_2^n} (-1)^{f(x) + f(x+u)}.$$

Zhang and Zheng [17] proposed the idea of Global Avalanche Characteristics
(GAC). One of the important indicators of GAC is the absolute indicator.

Definition 4. Let $f$ be a Boolean function on $F_2^n$. The absolute indicator of $f$
is defined as

$$\Delta_f = \max_{x \in F_2^n \setminus \{0\}} |\Delta_f(x)|.$$

3 New Lower Bound for the Absolute Indicator of 
Resilient Functions 

In this section we prove a new lower bound for the absolute indicator of resilient
functions. First, we establish an important technical formula. Note that this
formula can be deduced from the relation $W_f^2(x) = \sum_{u \in F_2^n} (-1)^{\langle u, x \rangle}\Delta_f(u)$ given
in [5] and [4] but we prefer to give a direct proof in Appendix A.

Theorem 3.

$$\Delta_f(u) = -2^n + 2^{1-n} \sum_{\substack{x \in F_2^n \\ \langle x, u \rangle \equiv 0 \pmod 2}} W_f^2(x).$$
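The formula of Theorem 3 can be confirmed numerically by computing the autocorrelation coefficients both directly and from the squared Walsh spectrum. This is a sketch on a randomly chosen 4-variable function; the seed and the helper names are arbitrary assumptions.

```python
import random

def parity(v):
    return bin(v).count("1") & 1

def walsh(tt, n):
    size = 1 << n
    return [sum((-1) ** (tt[x] ^ parity(u & x)) for x in range(size))
            for u in range(size)]

def autocorrelation(tt, n, u):
    """Direct definition: sum_x (-1)^(f(x) + f(x+u))."""
    return sum((-1) ** (tt[x] ^ tt[x ^ u]) for x in range(1 << n))

def autocorrelation_from_walsh(W, n, u):
    """Theorem 3: -2^n + 2^(1-n) * sum of W_f(x)^2 over x with <x,u> even."""
    s = sum(W[x] ** 2 for x in range(1 << n) if parity(x & u) == 0)
    return -(1 << n) + (2 * s) // (1 << n)   # the division is exact

n = 4
rng = random.Random(2001)
tt = [rng.randint(0, 1) for _ in range(1 << n)]
W = walsh(tt, n)
direct = [autocorrelation(tt, n, u) for u in range(1 << n)]
via_walsh = [autocorrelation_from_walsh(W, n, u) for u in range(1 << n)]
print(direct == via_walsh)
```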
We denote by $e_i$ the vector of length $n$ that has a one in the $i$th component
and zeroes in all other components.

Lemma 2. Let $f$ be an $m$-resilient Boolean function on $F_2^n$. Then

$$\Delta_f \ge \left(\frac{2m - n + 2}{n}\right)2^n.$$

Proof. We form the matrix $B$ with $n$ columns, writing in rows of $B$ each
binary vector $u \in F_2^n$ exactly $W_f^2(u)$ times. By Parseval's equality the matrix $B$
contains exactly $2^{2n}$ rows. By the Xiao Guo-Zhen-Massey spectral characterization
[9] each row of the matrix $B$ contains at most $n - m - 1$ zeroes. It follows that
the total number of zeroes in $B$ is at most $(n - m - 1)2^{2n}$. Therefore, there exists
some $i$th column in $B$ that contains at most $\frac{(n-m-1)2^{2n}}{n}$ zeroes. By construction
it follows that $\sum_{x \in F_2^n,\ x_i = 0} W_f^2(x) \le \frac{(n-m-1)2^{2n}}{n}$. Then by Theorem 3 we have

$$\Delta_f(e_i) = -2^n + 2^{1-n} \sum_{\substack{x \in F_2^n \\ x_i = 0}} W_f^2(x) \le -2^n + \frac{n-m-1}{n}2^{n+1} = -\left(\frac{2m-n+2}{n}\right)2^n.$$

It follows that $\Delta_f \ge \left(\frac{2m-n+2}{n}\right)2^n$. □

In the next theorem we improve the lower bound of Lemma 2. 

Theorem 4. Let $f$ be an $m$-resilient Boolean function on $F_2^n$. Then
$\Delta_f \ge \left(\frac{2m-n+3}{n+1}\right)2^n$.

Proof. Suppose that in the proof of Lemma 2 the matrix $B$ contains exactly
$h2^{2n}$ rows with less than $n - m - 1$ zeroes. Then repeating the arguments from
the proof of Lemma 2 we have

$$\Delta_f \ge \left(\frac{2m - n + 2 + 2h}{n}\right)2^n. \qquad (1)$$

At the same time it is not hard to see that

$$\Delta_f(1 \ldots 1) = -2^n + 2^{1-n} \sum_{|x| \equiv 0 \pmod 2} W_f^2(x)$$

and

$$\Delta_f \ge |\Delta_f(1 \ldots 1)| \ge (1 - 2h)2^n. \qquad (2)$$

The right part in (1) is increasing in $h$ whereas the right part in (2) is decreasing
in $h$. The right parts in (1) and (2) are equal when $h = \frac{n-m-1}{n+1}$. Therefore,
$\Delta_f \ge \left(\frac{2m-n+3}{n+1}\right)2^n$. □
In [19] Zheng and Zhang proved that for balanced mth order correlation 
immune function / on T 2 ” the bound Af > 2 n.-m ._1 holds. It follows that Af > 
2™ + 2. Our Theorem 4 improves significantly this result for m > {n — 3)/2. 
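As a sanity check of the bound $\Delta_f \ge \left(\frac{2m-n+3}{n+1}\right)2^n$, one can enumerate all balanced functions of $n = 4$ variables, determine the largest $m$ for which each is $m$-resilient, and compare the absolute indicator with the bound. The sketch below is only an exhaustive test for this one small $n$, with functions stored as lists of $\pm 1$ values; all names are illustrative.

```python
from fractions import Fraction
from itertools import combinations

N = 4
SIZE = 1 << N
PARITY = [[bin(u & x).count("1") & 1 for x in range(SIZE)] for u in range(SIZE)]
BY_WEIGHT = sorted(range(SIZE), key=lambda u: bin(u).count("1"))

def resiliency_order(sign):
    """Largest m with W_f(u) = 0 for all |u| <= m (sign[x] = (-1)^f(x), f balanced)."""
    for u in BY_WEIGHT:
        w = sum(-s if p else s for s, p in zip(sign, PARITY[u]))
        if w != 0:
            return bin(u).count("1") - 1
    return N  # unreachable: Parseval forces some nonzero coefficient

def absolute_indicator(sign):
    return max(abs(sum(sign[x] * sign[x ^ u] for x in range(SIZE)))
               for u in range(1, SIZE))

def check_theorem4():
    violations, smallest = 0, {}
    for ones in combinations(range(SIZE), SIZE // 2):   # balanced functions only
        sign = [1] * SIZE
        for x in ones:
            sign[x] = -1
        m = resiliency_order(sign)
        bound = Fraction(2 * m - N + 3, N + 1) * SIZE
        if bound <= 0:
            continue                                    # bound is vacuous here
        delta = absolute_indicator(sign)
        smallest[m] = min(smallest.get(m, delta), delta)
        if delta < bound:
            violations += 1
    return violations, smallest

violations, smallest = check_theorem4()
print(violations, smallest)
```

The dictionary `smallest` records, for each resiliency order with a non-vacuous bound, the smallest absolute indicator that actually occurs.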
4 Upper Bound for the Number of Nonlinear Variables 
in High Order Resilient Functions 

In this section we prove the new upper bound for the number of nonlinear vari- 
ables in high order resilient functions. 

The next lemma is well-known. 

Lemma 3. [11] Let $f$ be a Boolean function on $F_2^n$, $\deg(f) \ge 1$. Then

$$2^{n-\deg(f)} \le wt(f) \le 2^n - 2^{n-\deg(f)}.$$

The next lemma is obvious.

Lemma 4. Let $f$ be a Boolean function on $F_2^n$, $\deg(f) \ge 1$. Then
$\deg(f(x) \oplus f(x + e_i)) \le \deg(f(x)) - 1$.

Lemma 5. Let $f$ be a Boolean function on $F_2^n$, $\deg(f, x_i) \ge 2$. Then

$$\sum_{\substack{u \in F_2^n \\ u_i = 0}} W_f^2(u) \ge 2^{2n-\deg(f)+1}.$$

Proof. By Theorem 3 using Lemmas 3 and 4 we have

$$-2^n + 2^{1-n} \sum_{\substack{u \in F_2^n \\ u_i = 0}} W_f^2(u) = \Delta_f(e_i) = \sum_{x \in F_2^n} (-1)^{f(x) + f(x + e_i)} =$$

$$2^n - 2wt(f(x) \oplus f(x + e_i)) \ge 2^n - 2\left(2^n - 2^{n-(\deg(f)-1)}\right) = -2^n + 2^{n-\deg(f)+2}.$$

It follows that $\sum_{u_i = 0} W_f^2(u) \ge 2^{2n-\deg(f)+1}$. □

Theorem 5. Let $f$ be an $(m = n - k)$-resilient Boolean function on $F_2^n$, $k \ge 2$,
and $\deg(f, x_i) \ge 2$ for each $i = 1, \ldots, n$. Then $n \le (k-1)2^{\deg(f)-1}$.

Proof. We form the matrix $B$ with $n$ columns, writing in rows of $B$ each
binary vector $u \in F_2^n$ exactly $W_f^2(u)$ times. By Parseval's equality the matrix $B$
contains exactly $2^{2n}$ rows. By the Xiao Guo-Zhen-Massey spectral characterization
[9] each row of the matrix $B$ contains at most $k - 1$ zeroes. It follows that the
total number of zeroes in $B$ is at most $(k-1)2^{2n}$. By Lemma 5 each column
of $B$ contains at least $2^{2n-\deg(f)+1}$ zeroes. Therefore
$n \le \frac{(k-1)2^{2n}}{2^{2n-\deg(f)+1}} = (k-1)2^{\deg(f)-1}$. □

Theorem 6. Let $f$ be an $(m = n - k)$-resilient Boolean function on $F_2^n$, $k \ge 2$,
and $\deg(f, x_i) \ge 2$ for each $i = 1, \ldots, n$. Then $n \le (k-1)2^{k-2}$.

Proof. By Siegenthaler's Inequality [13] we have $\deg(f) \le k - 1$. This fact
together with Theorem 5 gives the result. □

In [16] it is proved that $n \le (k-1)4^{k-2}$. Our Theorem 6 improves significantly
this result. Note that there exists an $(n-k)$-resilient function on $F_2^n$, $n = 3 \cdot 2^{k-2} - 2$,
that depends nonlinearly on all its $n$ variables (see constructions in [14]).
Corollary 1. Let $f$ be an $m$-resilient Boolean function on $F_2^n$. If
$n \ge (n-m-1)2^{n-m-2}$ then $\Delta_f = 2^n$.

Proof. If $n > (n-m-1)2^{n-m-2}$ then by Theorem 6 the function $f$ depends
on some variable linearly, hence $\Delta_f = 2^n$. If $n = (n-m-1)2^{n-m-2}$ and $f$
depends on all its variables nonlinearly then according to the proofs of Theorems
5 and 6 we have that each row of the matrix $B$ contains exactly $n - m - 1$ zeroes.
But in this case $|\Delta_f(1 \ldots 1)| = 2^n$, so $\Delta_f = 2^n$. □

5 Resiliency Orders of Quadratic Functions 

In the next two sections we apply the autocorrelation coefficients for the analysis 
of quadratic Boolean functions, i. e. functions with algebraic degree 2 in each 
variable. 

Lemma 6. For any Boolean function $g$ on $F_2^{n-1}$ the function
$f(x_1, x_2, x_3, \ldots, x_n) = g(x_1 \oplus x_2, x_3, \ldots, x_n) \oplus x_1$ is balanced.

Proof. We combine all vectors from $F_2^n$ into pairs $(y', y'')$ such that $y'$ and $y''$
differ only in the first and second components and coincide in all other components.
Then $f(y') = f(y'') \oplus 1$ and $wt(f) = \sum_{(y', y'')} (f(y') + f(y'')) = 2^{n-1}$. □

Lemma 7. For each function $g(y_1, \ldots, y_n)$ on $F_2^n$ the function
$f(x_1, \ldots, x_{2n}) = g(x_1 \oplus x_{n+1}, x_2 \oplus x_{n+2}, \ldots, x_n \oplus x_{2n}) \oplus x_1 \oplus x_2 \oplus \ldots \oplus x_n$
is $(n-1)$-resilient.

Proof. Consider an arbitrary subfunction $f'$ obtained from $f$ by substitution
of $n - 1$ constants for some $n - 1$ variables. Then there exists $j$ such that both
variables $x_j$ and $x_{n+j}$ remain free. Then $f'$ has the form
$f' = g'(\ldots, x_j \oplus x_{n+j}, \ldots) \oplus x_j$ and by Lemma 6 the function $f'$ is balanced.
Hence, $f$ is $(n-1)$-resilient. □
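The construction of Lemma 7 is simple to test by machine. The sketch below builds the truth table of $f$ from a truth table of $g$ and checks $(n-1)$-resiliency through the spectral characterization of Theorem 1 together with balancedness; the bit layout (low $n$ bits for $x_1, \ldots, x_n$, high $n$ bits for $x_{n+1}, \ldots, x_{2n}$) and the function names are assumptions made for the example.

```python
import random

def lemma7_function(g_tt, n):
    """Truth table of f(x_1..x_2n) = g(x_1+x_{n+1}, ..., x_n+x_2n) + x_1 + ... + x_n."""
    mask = (1 << n) - 1
    f_tt = []
    for x in range(1 << (2 * n)):
        low, high = x & mask, x >> n          # (x_1..x_n) and (x_{n+1}..x_2n)
        f_tt.append(g_tt[low ^ high] ^ (bin(low).count("1") & 1))
    return f_tt

def is_resilient(tt, nvars, m):
    """Balanced (W_f(0) = 0) and W_f(u) = 0 for all 1 <= |u| <= m."""
    size = 1 << nvars
    for u in range(size):
        if bin(u).count("1") <= m:
            w = sum((-1) ** (tt[x] ^ (bin(u & x).count("1") & 1)) for x in range(size))
            if w != 0:
                return False
    return True

rng = random.Random(7)
n = 3
results = []
for _ in range(10):
    g_tt = [rng.randint(0, 1) for _ in range(1 << n)]   # arbitrary g on n variables
    results.append(is_resilient(lemma7_function(g_tt, n), 2 * n, n - 1))
print(all(results))
```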

Theorem 7. Quadratic $m$-resilient functions of $n$ variables exist if and only if
$m \le \frac{n}{2} - 1$.

Proof. Substituting into Theorem 5 the value $\deg(f) = 2$ we have
$n \le 2(n - m - 1)$. It follows that $m \le \frac{n}{2} - 1$. Now suppose that $m \le \frac{n}{2} - 1$. Consider
the function $f(x_1, \ldots, x_{2(n-m-1)}) = g(x_1 \oplus x_{n-m}, x_2 \oplus x_{n-m+1}, \ldots, x_{n-m-1} \oplus x_{2(n-m-1)}) \oplus x_1 \oplus x_2 \oplus \ldots \oplus x_{n-m-1}$
where $g$ is some quadratic function. By
Lemma 7 the function $f$ is a $(2(n-m-1))$-variable $(n-m-2)$-resilient quadratic
function. It is easy to check that if we substitute some constants for the
variables $x_{n+1}, \ldots, x_{2(n-m-1)}$ in $f$ then we obtain a desired $n$-variable $m$-resilient
function. □

6 Complete Description of Quadratic Boolean Functions 
with Maximum Resiliency Order 

In this section we give a complete description of quadratic resilient Boolean
functions that achieve the bound $m \le \frac{n}{2} - 1$. It is obvious that for such functions
$n$ is even. Therefore in this section we consider for convenience $(N = 2n)$-variable
$(m = n - 1)$-resilient functions.
Definitions. The notation $f(x_1, \ldots, x_n) = g(x_1, \ldots, x_n)$ means that the
Boolean functions $f$ and $g$ are equal up to permutation of indices of variables.

Theorem 8. Let $f$ be an $(N = 2n)$-variable $(m = n - 1)$-resilient quadratic
function. Then $W_f(u) \ne 0$ only if $|u| = n$.

Proof. By Theorem 1 we have that $W_f(u) \ne 0$ only if $|u| \ge m + 1 = n$.

We form the matrix $B$ with $N$ columns, writing in rows of $B$ each binary
vector $u \in F_2^N$ exactly $W_f^2(u)$ times. By Parseval's equality $B$ contains exactly
$2^{2N} = 2^{4n}$ rows. Each row has at most $n$ zeroes, therefore the matrix $B$ contains
at most $n2^{4n}$ zeroes.

On the other hand, by Lemma 5 each column of the matrix $B$ contains at least
$2^{2N-\deg(f)+1} = 2^{4n-1}$ zeroes, i.e. the matrix $B$ contains at least $2n \cdot 2^{4n-1} = n2^{4n}$
zeroes.

Thus the matrix $B$ contains exactly $n2^{4n}$ zeroes and each row of $B$ has
exactly $n$ zeroes and $n$ ones. □

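Lemma 8. Let $e_{pq} = (0, \ldots, 0, 1, 0, \ldots, 0, 1, 0, \ldots, 0) \in F_2^n$ (with ones in
positions $p$ and $q$), $p \ne q$, and let $f$ be a
quadratic function on $F_2^n$. Then $\Delta_f(e_{pq}) \in \{0, \pm 2^n\}$ and the next statements
hold:

$$\Delta_f(e_{pq}) = 2^n \iff f(x) = g(\ldots, x_p \oplus x_q, \ldots),\ g \text{ is quadratic},$$
$$\Delta_f(e_{pq}) = -2^n \iff f(x) = g(\ldots, x_p \oplus x_q, \ldots) \oplus x_p,\ g \text{ is quadratic}.$$

Proof. We write the function $f$ in the form
$f(x) = \bigoplus_{1 \le i < j \le n} a_{ij}x_ix_j \oplus \bigoplus_{1 \le i \le n} b_ix_i \oplus c$
where $a_{ij} = a_{ji}$ and $a_{ii} = 0$. Then

$$\Delta_f(e_{pq}) = \sum_{x \in F_2^n} (-1)^{\bigoplus_{i \ne p, q} (a_{pi} \oplus a_{qi})x_i \,\oplus\, a_{pq}(x_p \oplus x_q \oplus 1) \,\oplus\, b_p \oplus b_q}.$$

If the expression $\bigoplus_{i \ne p, q} (a_{pi} \oplus a_{qi})x_i \oplus a_{pq}(x_p \oplus x_q \oplus 1) \oplus b_p \oplus b_q$ contains at least
one linear term $x_k$ then we have $\Delta_f(e_{pq}) = 0$. If this expression does not contain
linear terms, it means that $a_{pi} = a_{qi}$ for all $i$. Then $\Delta_f(e_{pq}) = 2^n(-1)^{b_p \oplus b_q}$ and
the function $f$ can be represented in the form

$$f(x) = \bigoplus_{\substack{i < j \\ i, j \ne p, q}} a_{ij}x_ix_j \oplus \bigoplus_{i \ne p, q} b_ix_i \oplus c \oplus \Bigl(\bigoplus_{j \ne p, q} a_{pj}x_j \oplus b_q\Bigr)(x_p \oplus x_q) \oplus (b_p \oplus b_q)x_p,$$

that completes the proof. □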



Theorem 9. Let $f(x_1, \ldots, x_{2n})$ be a quadratic function on $F_2^{2n}$. Then

$$\sum_{1 \le p < q \le 2n} \Delta_f(e_{pq}) = -n2^{2n}$$

if and only if

$$f = g(x_1 \oplus x_{n+1}, \ldots, x_n \oplus x_{2n}) \oplus x_1 \oplus \ldots \oplus x_n$$

where $g(y_1, \ldots, y_n)$ is a quadratic function on $F_2^n$.
Proof. Consider an arbitrary quadratic function $f$ on $F_2^{2n}$. On the set of
vertices $V = \{1, \ldots, 2n\}$ we construct the graph $G = (V, E)$ by the next rule:
$(p, q) \in E$ if and only if $\Delta_f(e_{pq}) \ne 0$.

Each connected component $H^t = (V^t, E^t)$ of this graph is a complete graph
since by Lemma 8 we have that $(p, q) \in E^t$ if and only if $a_{pi} = a_{qi}$ for all $i$.

We divide $V^t$ into two subsets $V_0^t \cup V_1^t$ such that $i \in V_{b_i}^t$. Let us denote
$v_0^t := |V_0^t|$, $v_1^t := |V_1^t|$.

Then for $p$ and $q$ from the same subset of $V^t$ by Lemma 8 we have
$\Delta_f(e_{pq}) = 2^{2n}$ and for $p$ and $q$ from different subsets we have $\Delta_f(e_{pq}) = -2^{2n}$.

Let us estimate the sum

$$\sum_{(p,q) \in E^t} \Delta_f(e_{pq}) = 2^{2n}\left(\binom{v_0^t}{2} + \binom{v_1^t}{2} - v_0^t v_1^t\right) = 2^{2n-1}\left((v_0^t - v_1^t)^2 - (v_0^t + v_1^t)\right) \ge -2^{2n-1}|V^t|.$$

The equality is achieved only for $v_0^t = v_1^t$.

Hence, $\sum_{1 \le p < q \le 2n} \Delta_f(e_{pq}) \ge -2^{2n-1}\sum_t |V^t| = -n2^{2n}$, moreover, the equality
is achieved only if $v_0^t = v_1^t$ for all $t$.

Thus, if we have the equality $\sum \Delta_f(e_{pq}) = -n2^{2n}$ then it is possible to divide
the set of all variables into the pairs $(i_l, j_l)$ where $i_l \in V_0^t$, $j_l \in V_1^t$. Then the
function will be represented in the form
$f(x_1, \ldots, x_{2n}) = g(x_{i_1} \oplus x_{j_1}, \ldots, x_{i_l} \oplus x_{j_l}, \ldots) \oplus x_{j_1} \oplus \ldots \oplus x_{j_l} \oplus \ldots$,
i.e. in the desired form.

Now suppose that the function has the form
$g(x_1 \oplus x_{n+1}, \ldots, x_n \oplus x_{2n}) \oplus x_1 \oplus \ldots \oplus x_n$.
Then after the construction of the graph $G$ and the partitioning of
it into components, we have $i \in V_{b_i}^t$ and $i + n \in V_{b_i \oplus 1}^t$ for all $i$, $i \le n$. It follows
that $v_0^t = v_1^t$ for all $t$. □

Theorem 10. Let $f(x_1, \ldots, x_{2n})$ be a $(2n)$-variable $(n-1)$-resilient quadratic
function. Then there exists a quadratic function $g(y_1, \ldots, y_n)$ such that

$$f(x_1, \ldots, x_{2n}) = g(x_1 \oplus x_{n+1}, \ldots, x_n \oplus x_{2n}) \oplus x_1 \oplus \ldots \oplus x_n.$$

Proof. Substitute the equation from Theorem 3 into Theorem 9:

$$\sum_{1 \le p < q \le 2n} \Delta_f(e_{pq}) = \sum_{p < q}\Bigl(-2^{2n} + 2^{1-2n} \sum_{\langle x, e_{pq} \rangle \equiv 0 \pmod 2} W_f^2(x)\Bigr) = -n(2n-1)2^{2n} + 2^{1-2n} \sum_{p < q}\ \sum_{x_p = x_q} W_f^2(x).$$

By Theorem 8 for $|x| \ne n$ we have $W_f(x) = 0$, hence

$$\sum_{p < q}\ \sum_{x_p = x_q} W_f^2(x) = \sum_{p < q}\ \sum_{\substack{x_p = x_q \\ |x| = n}} W_f^2(x) = \sum_{|x| = n} W_f^2(x) \sum_{p < q:\ x_p = x_q} 1 = (n^2 - n)\sum_{|x| = n} W_f^2(x) = (n^2 - n)2^{4n}.$$

Therefore,

$$\sum_{1 \le p < q \le 2n} \Delta_f(e_{pq}) = -n(2n-1)2^{2n} + 2^{1-2n}(n^2 - n)2^{4n} = -n2^{2n}.$$

It follows by Theorem 9 that all $(2n)$-variable $(n-1)$-resilient quadratic
functions have the given form. □



7 Nonexistence of Unbalanced Nonconstant mth Order
Correlation Immune Boolean Functions on $F_2^n$ for
$m > 0.75n - 1.25$

In this section we prove that unbalanced nonconstant $m$th order correlation
immune Boolean functions on $F_2^n$ do not exist for $m > 0.75n - 1.25$. Similar
statements are known for multioutput functions (see [2], [10]) but for usual
Boolean functions until now statements of such type were not formulated even
as conjectures.

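Theorem 11. Let $f$ be an arbitrary Boolean function on $F_2^n$. Let $w \in F_2^n \setminus \{0\}$.
Then

$$\sum_{\substack{x \in F_2^n \\ \langle x, w \rangle = 0}} W_f^2(x) = 2^{n-|w|} \sum_{\substack{u \in F_2^n \\ u \preceq w}} \Delta_f(u).$$

Proof. Summing $\Delta_f(u)$ over all $u$, $u \preceq w$, by Theorem 3 we have

$$\sum_{\substack{u \in F_2^n \\ u \preceq w}} \Delta_f(u) = \sum_{\substack{u \in F_2^n \\ u \preceq w}} \Bigl(-2^n + 2^{1-n} \sum_{\substack{x \in F_2^n \\ \langle x, u \rangle \equiv 0 \pmod 2}} W_f^2(x)\Bigr) =$$

$$-2^{n+|w|} + 2^{1-n} \sum_{\substack{u \in F_2^n \\ u \preceq w}}\ \sum_{\substack{x \in F_2^n \\ \langle x, u \rangle \equiv 0 \pmod 2}} W_f^2(x) =$$

$$-2^{n+|w|} + 2^{1-n}\Bigl(2^{|w|} \sum_{\substack{x \in F_2^n \\ \langle x, w \rangle = 0}} W_f^2(x) + 2^{|w|-1} \sum_{\substack{x \in F_2^n \\ \langle x, w \rangle > 0}} W_f^2(x)\Bigr) =$$

$$-2^{n+|w|} + 2^{1-n}\Bigl(2^{|w|-1} \cdot 2^{2n} + 2^{|w|-1} \sum_{\substack{x \in F_2^n \\ \langle x, w \rangle = 0}} W_f^2(x)\Bigr) = 2^{|w|-n} \sum_{\substack{x \in F_2^n \\ \langle x, w \rangle = 0}} W_f^2(x).$$

□

Theorem 12. Let $f$ be an arbitrary Boolean function on $F_2^n$. Then

$$\sum_{\substack{u \in F_2^n \\ u \preceq w}} \Delta_f(u) = \sum_{f'} \left(2^{|w|} - 2wt(f')\right)^2$$

where the last sum is taken over all $2^{n-|w|}$ subfunctions $f'$ of $|w|$ variables
obtained from $f$ by substituting constants for all $x_i$ such that $w_i = 0$.

Proof.

$$\sum_{u \preceq w} \Delta_f(u) = \sum_{u \preceq w}\ \sum_{x \in F_2^n} (-1)^{f(x) + f(x+u)} = \sum_{f'}\ \sum_{x', x''} (-1)^{f'(x') + f'(x'')} =$$

$$\sum_{f'} \Bigl(wt(f')^2 + \left(2^{|w|} - wt(f')\right)^2 - 2wt(f')\left(2^{|w|} - wt(f')\right)\Bigr) = \sum_{f'} \left(2^{|w|} - 2wt(f')\right)^2,$$

where $x'$ and $x''$ run over all inputs of the subfunction $f'$. □

Corollary 2. Let $f$ be an arbitrary Boolean function on $F_2^n$. Then

$$\sum_{\substack{x \in F_2^n \\ \langle x, w \rangle = 0}} W_f^2(x) = 2^{n-|w|+2} \sum_{f'} \left(2^{|w|-1} - wt(f')\right)^2$$

where the last sum is taken over all $2^{n-|w|}$ subfunctions $f'$ of $|w|$ variables
obtained from $f$ by substituting constants for all $x_i$ such that $w_i = 0$.

Proof. It follows immediately from Theorems 11 and 12. □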
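The identities of Theorems 11 and 12 can be verified numerically. In the sketch below the condition $\langle x, w \rangle = 0$ is read as a statement about the integer scalar product (the supports of $x$ and $w$ are disjoint), and $u \preceq w$ as componentwise precedence; vectors are encoded as bit masks, which is an assumption of the example rather than the paper's notation.

```python
import random

def popcount(v):
    return bin(v).count("1")

def check_identities(tt, n):
    """Check Theorems 11 and 12 for every nonzero w on the function with truth table tt."""
    size = 1 << n
    sign = [(-1) ** bit for bit in tt]
    W = [sum(sign[x] * (-1) ** (popcount(u & x) & 1) for x in range(size))
         for u in range(size)]
    delta = [sum(sign[x] * sign[x ^ u] for x in range(size)) for u in range(size)]
    for w in range(1, size):
        below = [u for u in range(size) if (u & ~w) == 0]      # u precedes w
        disjoint = [x for x in range(size) if (x & w) == 0]    # <x, w> = 0
        walsh_sum = sum(W[x] ** 2 for x in disjoint)
        auto_sum = sum(delta[u] for u in below)
        # Each assignment c of the variables outside w yields one subfunction f'.
        sub_sum = sum((len(below) - 2 * sum(tt[c | y] for y in below)) ** 2
                      for c in disjoint)
        if walsh_sum != (size >> popcount(w)) * auto_sum:      # Theorem 11
            return False
        if auto_sum != sub_sum:                                # Theorem 12
            return False
    return True

rng = random.Random(12)
outcomes = [check_identities([rng.randint(0, 1) for _ in range(16)], 4)
            for _ in range(5)]
print(all(outcomes))
```

Corollary 2 follows from the two checked equalities, since $(2^{|w|} - 2wt(f'))^2 = 4(2^{|w|-1} - wt(f'))^2$.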

Remark 1. If $f$ is an $(n-k)$th order nonaffine correlation immune Boolean
function on $F_2^n$ then by (2) we have $W_f(0) \equiv 0 \pmod{2^{n-k+1}}$. Therefore
$W_f(0) \equiv 2^{n-i} \pmod{2^{n-i+1}}$ for some $i$, $i \in \{1, 2, \ldots, k-1\}$.

Theorem 13. Let $f$ be an unbalanced nonconstant $(n-k)$th order correlation
immune Boolean function on $F_2^n$. Let $W_f(0) = \pm p \cdot 2^{n-i}$ where $p$ is some odd
positive integer, $i \in \{1, 2, \ldots, k-1\}$. Then

$$\binom{n}{i} \le \left(2^{2i} - p^2\right)\binom{k-1}{i}. \qquad (3)$$

Proof. By Lemma 1 we have that $|2^{n-1} - wt(f)| = p \cdot 2^{n-i-1}$. Let $w \in F_2^n$
be an arbitrary vector such that $|w| = i$. Then

$$\sum_{f'} |2^{i-1} - wt(f')| \ge |2^{n-1} - wt(f)| = p \cdot 2^{n-i-1}$$

where the sum is taken over all $2^{n-i}$ subfunctions $f'$ of $i$ variables obtained from
$f$ by substituting constants for all $x_l$ such that $w_l = 0$. All terms in the sum are
integers. It follows that

$$\sum_{f'} \left(2^{i-1} - wt(f')\right)^2 \ge \left(\left(\frac{p+1}{2}\right)^2 + \left(\frac{p-1}{2}\right)^2\right) \cdot 2^{n-i-1}.$$

Therefore by Corollary 2 we have

$$\sum_{\substack{x \in F_2^n \\ \langle x, w \rangle = 0}} W_f^2(x) \ge 2^{n-i+2} \cdot \frac{p^2+1}{2} \cdot 2^{n-i-1} = (p^2 + 1) \cdot 2^{2n-2i}.$$

Hence,

$$\sum_{\substack{x \in F_2^n \setminus \{0\} \\ \langle x, w \rangle = 0}} W_f^2(x) \ge 2^{2n-2i}. \qquad (4)$$

Next, we form the matrix $B$ with $n$ columns, writing in rows of $B$ each
binary vector $x \in F_2^n$ exactly $W_f^2(x)$ times. By Parseval's equality the matrix
$B$ contains exactly $2^{2n}$ rows. The total number of nonzero rows of $B$ is
$2^{2n} - p^2 2^{2n-2i}$. By the Xiao Guo-Zhen-Massey spectral characterization [9] each nonzero
row of the matrix $B$ contains at most $k - 1$ zeroes. It follows that each nonzero
row in $B$ contains at most $\binom{k-1}{i}$ subsets of $i$ zeroes. All nonzero rows in $B$
contain at most $\left(2^{2n} - p^2 2^{2n-2i}\right)\binom{k-1}{i}$ subsets of $i$ zeroes. At the same time
by (4) for any $i$ columns in $B$ there exist at least $2^{2n-2i}$ nonzero rows that
contain only zeroes in these $i$ columns. Therefore,

$$\binom{n}{i} \le \frac{2^{2n} - p^2 2^{2n-2i}}{2^{2n-2i}}\binom{k-1}{i} = \left(2^{2i} - p^2\right)\binom{k-1}{i}.$$

□

Corollary 3. Let $f$ be an $m$th order correlation immune Boolean function on
$F_2^n$. Let $wt(f) = u \cdot 2^h$ where $u$ is an odd positive integer, $h$ is an integer. Then

$$\binom{n}{n-h-1} \le u\left(2^{n-h} - u\right)\binom{n-m-1}{n-h-1}.$$

Proof. It follows immediately from Theorem 13 and Lemma 1. □

Theorem 14. Let $f$ be an unbalanced nonconstant $(n-k)$th order correlation
immune Boolean function on $F_2^n$. Then $n \le 4k - 5$.

Proof. By Remark 1 we can assume that $W_f(0) \equiv 2^{n-i} \pmod{2^{n-i+1}}$ for
some $i$, $i \in \{1, 2, \ldots, k-1\}$. Then by Theorem 13 we have

$$n(n-1)\cdots(n-i+1) \le (2^{2i} - 1)(k-1)(k-2)\cdots(k-i). \qquad (5)$$

Suppose that $n \ge 4(k-1)$. Then $n(n-1)\cdots(n-i+1) \ge 2^{2i}(k-1)(k-2)\cdots(k-i)$,
which contradicts (5). □
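
The contradiction argument in the proof of Theorem 14 can be checked exhaustively for small parameters. The following Python sketch assumes inequality (5) in the form $n(n-1)\cdots(n-i+1) \le (2^{2i}-1)(k-1)\cdots(k-i)$; the helper names are illustrative and do not come from the paper.

```python
from math import prod


def falling(n, i):
    """Falling factorial n (n-1) ... (n-i+1)."""
    return prod(n - j for j in range(i))


def inequality_5_holds(n, k, i):
    """Inequality (5): n(n-1)...(n-i+1) <= (2^(2i) - 1)(k-1)(k-2)...(k-i)."""
    return falling(n, i) <= (4 ** i - 1) * falling(k - 1, i)


def max_n_allowed(k, search_limit=200):
    """Largest n <= search_limit for which (5) holds for some i in 1..k-1."""
    return max(
        (n for n in range(k, search_limit + 1)
         if any(inequality_5_holds(n, k, i) for i in range(1, k))),
        default=None,
    )


if __name__ == "__main__":
    for k in range(2, 9):
        print(k, max_n_allowed(k), 4 * k - 5)
```

For each small k the largest admissible n found this way does not exceed 4k - 5, in agreement with the theorem; the check is only a numerical illustration, not a replacement for the proof.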

Corollary 4. For $m > 0.75n - 1.25$ there do not exist unbalanced nonconstant
$m$th order correlation immune Boolean functions on $F_2^n$.

It is easy to check that the 3-variable function $f$ that takes the value 1 only
at the two vectors (0, 0, 0) and (1, 1, 1) is correlation immune of order 1. Therefore
the bound in Corollary 4 is tight.
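
The 3-variable example can be verified directly with the Xiao-Massey spectral criterion (all Walsh coefficients $W_f(u)$ with $1 \le |u| \le m$ vanish). The sketch below is a brute-force check; the function names `walsh` and `ci_order` are illustrative.

```python
from itertools import product


def walsh(f, n):
    """Return {u: W_f(u)} with W_f(u) = sum over x of (-1)^(f(x) + <u, x>)."""
    points = list(product((0, 1), repeat=n))
    spectrum = {}
    for u in points:
        total = 0
        for x in points:
            inner = sum(a & b for a, b in zip(u, x)) & 1
            total += (-1) ** (f(x) ^ inner)
        spectrum[u] = total
    return spectrum


def ci_order(f, n):
    """Largest m such that W_f(u) = 0 for every u of weight 1..m."""
    spectrum = walsh(f, n)
    m = 0
    for weight in range(1, n + 1):
        if any(v != 0 for u, v in spectrum.items() if sum(u) == weight):
            break
        m = weight
    return m


def f(x):
    """The function equal to 1 exactly at (0,0,0) and (1,1,1)."""
    return 1 if x in ((0, 0, 0), (1, 1, 1)) else 0


if __name__ == "__main__":
    print(walsh(f, 3)[(0, 0, 0)], ci_order(f, 3))
```

Here $W_f(0) = 4 \ne 0$, so the function is unbalanced, and its order of correlation immunity is $1 = 0.75 \cdot 3 - 1.25$.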

Remark 2. Until now the Bierbrauer-Friedman bound [8], [1]

$$wt(f) \ge 2^n \cdot \frac{2(m+1) - n}{2(m+1)} \qquad (6)$$

was the best known lower bound for the weight of high order correlation immune
nonconstant functions. If we substitute $m > 0.75n - 1.25$ into (6) we obtain
$wt(f) > 2^n \frac{n-1}{3n-1}$. In fact, it follows from our Corollary 4 that in this case $wt(f) = 2^{n-1}$.

8 Tradeoff between Correlation Immunity and
Nonlinearity for Unbalanced Boolean Functions

In [12] Sarkar and Maitra proved (this result was obtained independently also
in [14] and [18]) that for an $n$-variable $m$th order correlation immune Boolean
function $f$, $n - m \ge 1$, the inequality $nl(f) \le 2^{n-1} - 2^m$ holds. Moreover, if $f$ is
balanced (i.e. $m$-resilient), $n - m \ge 2$, then $nl(f) \le 2^{n-1} - 2^{m+1}$. Zheng
and Zhang proved that for unbalanced Boolean functions, $m \ge 0.6n - 0.4$, the
nonlinearity $2^{n-1} - 2^m$ cannot be achieved. Therefore for an $n$-variable $m$th order
correlation immune Boolean function $f$, $0.6n - 0.4 \le m \le n - 1$, the inequality
$nl(f) \le 2^{n-1} - 2^{m+1}$ holds. (Note that by our Corollary 4 for $m > 0.75n - 1.25$
unbalanced $n$-variable $m$th order correlation immune functions do not exist at
all!) At the same time in [15] Tarannikov gives constructions of $n$-variable
$m$-resilient Boolean functions with the nonlinearity $2^{n-1} - 2^{m+1}$ for
$0.6n - 1 \le m \le n - 2$. Thus, although the upper bound in [12] for unbalanced functions is
higher than for balanced ones, nevertheless, at least for $0.6n - 0.4 \le m \le n - 2$ the
maximum possible nonlinearity of $m$-resilient Boolean functions is not less than
the maximum possible nonlinearity of $m$th order correlation immune unbalanced
Boolean functions. In this section we continue the investigations in this direction
and give new improvements of upper bounds for the nonlinearity of high order
correlation immune unbalanced Boolean functions. In our investigation we use
the inequality (3) obtained in Theorem 13.
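
For concreteness, the nonlinearity $nl(f) = 2^{n-1} - \frac{1}{2}\max_x |W_f(x)|$ can be computed from a truth table with a fast Walsh-Hadamard transform. This is a minimal sketch; the function names and the truth-table encoding (index = integer value of the input vector) are assumptions made for the illustration.

```python
def walsh_spectrum(truth_table):
    """Walsh coefficients of f, given as a list of 0/1 values of length 2^n."""
    w = [1 - 2 * bit for bit in truth_table]  # (-1)^f(x)
    h = 1
    while h < len(w):
        for start in range(0, len(w), 2 * h):
            for j in range(start, start + h):
                w[j], w[j + h] = w[j] + w[j + h], w[j] - w[j + h]
        h *= 2
    return w


def nonlinearity(truth_table):
    """nl(f) = 2^(n-1) - max|W_f| / 2."""
    spectrum = walsh_spectrum(truth_table)
    return len(truth_table) // 2 - max(abs(v) for v in spectrum) // 2


if __name__ == "__main__":
    # f = 1 exactly at 000 and 111: unbalanced, correlation immune of order 1
    print(nonlinearity([1, 0, 0, 0, 0, 0, 0, 1]))
```

For the 3-variable function equal to 1 only at (0,0,0) and (1,1,1), with $m = 1$, the computed value 2 equals $2^{n-1} - 2^m$, which is allowed because $m = 1$ is below $0.6n - 0.4 = 1.4$.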

Theorem 15. Let $f$ be an unbalanced $m$th order correlation immune function
on $F_2^n$. Suppose that $W_f(0) \equiv 2^{m+1} \pmod{2^{m+2}}$. Then for $n \ge 12$ the
inequality

$$m \le \frac{1}{2}n + \frac{1}{2}\log_2 n + const$$

holds where $const = \frac{1}{2}\log_2\left(\frac{\pi}{2}e^{8/9}\right) - 1$.

The proof of Theorem 15 is given in the Appendix B. 

Corollary 5. Let $f$ be an unbalanced $m$th order correlation immune function on
$F_2^n$. If $m > \frac{1}{2}n + \frac{1}{2}\log_2 n + \frac{1}{2}\log_2\left(\frac{\pi}{2}e^{8/9}\right) - 1$, $n \ge 12$, then $nl(f) \le 2^{n-1} - 2^{m+1}$.

Proof. By Theorem 15 we have $W_f(0) \not\equiv 2^{m+1} \pmod{2^{m+2}}$. It follows that
$|W_f(0)| \ge 2^{m+2}$. Therefore, $nl(f) = 2^{n-1} - \frac{1}{2}\max_x |W_f(x)| \le 2^{n-1} - \frac{1}{2}|W_f(0)| \le 2^{n-1} - 2^{m+1}$. □

Theorem 16. Let $f$ be an unbalanced $m$th order correlation immune function
on $F_2^n$. Suppose that $W_f(0) \equiv 2^{m+2} \pmod{2^{m+3}}$. Then for $n \ge 24$ the
inequality

$$m \le \frac{1}{2}n + \frac{3}{2}\log_2 n + \log_2\left(\frac{1}{4} + \frac{1}{n}\right) + const$$

holds where $const = \frac{1}{2}\log_2\left(\frac{\pi}{2}e^{8/9}\right) - 2$.

The proof of Theorem 16 is given in the Appendix C. 

Corollary 6. Let $f$ be an unbalanced $m$th order correlation immune function
on $F_2^n$. If $m > \frac{1}{2}n + \frac{3}{2}\log_2 n + \log_2\left(\frac{1}{4} + \frac{1}{n}\right) + \frac{1}{2}\log_2\left(\frac{\pi}{2}e^{8/9}\right) - 2$, $n \ge 24$, then
$nl(f) \le 2^{n-1} - 2^{m+2}$.

Proof. By Theorems 15 and 16 we have that $|W_f(0)| \ge 2^{m+3}$. Therefore,
$nl(f) \le 2^{n-1} - \frac{1}{2}|W_f(0)| \le 2^{n-1} - 2^{m+2}$. □

Thus, we see that although the upper bound in [12] for the nonlinearity
of unbalanced functions is higher than for balanced ones, nevertheless, for higher $m$
balanced functions are "better" than unbalanced ones in this respect.

The authors are grateful to Oktay Kasim-Zadeh for valuable advice on the
analysis of inequality (7).

A Proof of Theorem 3



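If $u = 0$ then obviously $\Delta_f(u) = 2^n$, and

$$\sum_{x \in F_2^n,\ \langle x,u\rangle \equiv 0 \ (\mathrm{mod}\ 2)} W_f^2(x) = \sum_{x \in F_2^n} W_f^2(x) = 2^{2n},$$

therefore, the equality holds. So, we can assume that $u \ne 0$. Next, with all inner
products taken modulo 2,

$$\begin{aligned}
\sum_{x \in F_2^n,\ \langle x,u\rangle = 0} W_f^2(x)
&= \sum_{x \in F_2^n,\ \langle x,u\rangle = 0} \Big( \sum_{y \in F_2^n} (-1)^{f(y) + \langle x,y\rangle} \Big)^2 \\
&= \sum_{y',y'' \in F_2^n} (-1)^{f(y') + f(y'')} \sum_{x \in F_2^n,\ \langle x,u\rangle = 0} (-1)^{\langle x, y'+y''\rangle} \\
&= \frac{1}{2} \sum_{y',y'' \in F_2^n} (-1)^{f(y') + f(y'')} \Big( \sum_{x \in F_2^n} (-1)^{\langle x, y'+y''\rangle} + \sum_{x \in F_2^n} (-1)^{\langle x, u+y'+y''\rangle} \Big) \\
&= \frac{1}{2} \sum_{y' = y''} \Big( \sum_{x \in F_2^n} 1 + 0 \Big) + \frac{1}{2} \sum_{y'+y'' = u} (-1)^{f(y') + f(y'')} \Big( 0 + \sum_{x \in F_2^n} 1 \Big) \\
&= 2^{2n-1} + 2^{n-1} \sum_{y \in F_2^n} (-1)^{f(y) + f(y+u)} = 2^{2n-1} + 2^{n-1} \Delta_f(u).
\end{aligned}$$

□

B Proof of Theorem 15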

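For $i = k - 1 = n - m - 1$ in (3) we have

$$\binom{n}{i} \le 4^i - 1. \qquad (7)$$

For each $i$, $0 < i \le n$, we have $\binom{n}{i} \ge \left(\frac{n}{i}\right)^i$. It follows that if for some $i$ the
inequality (7) holds then the inequality $\left(\frac{n}{i}\right)^i < 4^i$ holds too. Therefore $\frac{n}{i} < 4$
and $\frac{n}{4} < i$. Thus, we obtain the simplest bound on $i$: $i > \frac{n}{4}$.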
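
The elementary bound $\binom{n}{i} \ge (n/i)^i$ and its consequence $i > n/4$ can be confirmed by exact integer arithmetic for small $n$. The sketch below assumes inequality (7) in the form $\binom{n}{i} \le 4^i - 1$; the function names are illustrative.

```python
from math import comb


def binomial_dominates_power(n, i):
    """Exact check of C(n, i) >= (n / i)^i, written as C(n, i) * i^i >= n^i."""
    return comb(n, i) * i ** i >= n ** i


def smallest_i_satisfying_7(n):
    """Smallest i in 1..n with C(n, i) <= 4^i - 1 (i = n always qualifies)."""
    return next(i for i in range(1, n + 1) if comb(n, i) <= 4 ** i - 1)


if __name__ == "__main__":
    for n in (8, 12, 24, 48):
        print(n, smallest_i_satisfying_7(n), n / 4)
```

In every case the smallest admissible $i$ exceeds $n/4$, as the argument predicts.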

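By means of the lower and upper bounds for $n!$ (see [7])

$$\sqrt{2\pi}\, n^{n+1/2} e^{-n} e^{(12n+1)^{-1}} < n! < \sqrt{2\pi}\, n^{n+1/2} e^{-n} e^{(12n)^{-1}}$$

it is easy to deduce the inequality

$$\binom{n}{i} \ge \frac{\exp\left(-\dfrac{1}{12 n \frac{i}{n}\left(1 - \frac{i}{n}\right)}\right)}{\sqrt{2\pi n \frac{i}{n}\left(1 - \frac{i}{n}\right)}} \; 2^{H\left(\frac{i}{n}\right) n} \qquad (8)$$

that holds for any $0 < i < n$.

(Here $H(x) = -x\log_2 x - (1-x)\log_2(1-x)$ is the entropy of $x$, $0 < x < 1$.)
If $\frac{n}{4} < i \le \frac{n}{2}$ then $\frac{1}{4} < \frac{i}{n} \le \frac{1}{2}$.

Consider the function $x(1-x)$. If $1/4 < x \le 1/2$ then $3/16 < x(1-x) \le 1/4$.
It follows that for $\frac{n}{4} < i \le \frac{n}{2}$ we have

$$\frac{3}{16} < \frac{i}{n}\left(1 - \frac{i}{n}\right) \le \frac{1}{4} \quad \text{and} \quad \frac{1}{\sqrt{2\pi n \frac{i}{n}\left(1 - \frac{i}{n}\right)}} \ge \sqrt{\frac{2}{\pi n}}. \qquad (9)$$

Since $\dfrac{1}{\frac{i}{n}\left(1 - \frac{i}{n}\right)} < \dfrac{16}{3}$, it follows that

$$\frac{1}{12 n \frac{i}{n}\left(1 - \frac{i}{n}\right)} < \frac{16}{12 n \cdot 3} = \frac{4}{9n} \le \frac{4}{9}$$

since $n \ge 1$. Therefore,

$$\exp\left(-\frac{1}{12 n \frac{i}{n}\left(1 - \frac{i}{n}\right)}\right) > e^{-4/9}. \qquad (10)$$

From (8) using (9) and (10) we have for any $i$, $\frac{n}{4} < i \le \frac{n}{2}$, that

$$\binom{n}{i} \ge \sqrt{\frac{2}{\pi n}}\; e^{-4/9}\; 2^{H\left(\frac{i}{n}\right) n}. \qquad (11)$$

The inequalities (7) and (11) imply the inequality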



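$$4^i \ge \sqrt{\frac{2}{\pi n}}\; e^{-4/9}\; 2^{H\left(\frac{i}{n}\right) n}. \qquad (12)$$

Taking the logarithm in (12) we have

$$2i \ge H\left(\frac{i}{n}\right) n - \frac{1}{2}\log_2 n + \alpha$$

where $\alpha = \log_2\left(\sqrt{\frac{2}{\pi}}\, e^{-4/9}\right)$. Dividing by $n$ we have

$$\frac{2i}{n} \ge H\left(\frac{i}{n}\right) - \frac{1}{2n}\log_2 n + \frac{\alpha}{n}.$$

Denoting $x = \frac{i}{n}$ we obtain the inequality

$$2x \ge H(x) - \frac{1}{2n}\log_2 n + \frac{\alpha}{n}$$

or

$$H(x) - 2x \le a(n) \qquad (13)$$

where $a(n) = \frac{1}{2n}\log_2 n - \frac{\alpha}{n}$.

Thus, the problem is reduced to obtaining a lower bound for $x$ satisfying
(13) under the condition $1/4 < x \le 1/2$.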

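Now put $y = \frac{1}{2} - x$. Then the conditions on $x$: $1/4 < x \le 1/2$ transform into
conditions on $y$: $0 \le y < 1/4$. To find the lower bound for $x$ satisfying (13) is
the same as to find the upper bound for $y$ satisfying

$$H\left(\frac{1}{2} - y\right) - 2\left(\frac{1}{2} - y\right) \le a(n)$$

or

$$H\left(\frac{1}{2} - y\right) - 1 + 2y \le a(n). \qquad (14)$$

By Taylor's formula

$$H\left(\frac{1}{2} - y\right) = H\left(\frac{1}{2}\right) - H'\left(\frac{1}{2}\right) y + \frac{1}{2} H''(\xi)\, y^2 \qquad (15)$$

where $\xi$ is some number from the interval $1/2 - y < \xi < 1/2$. Taking into account
that $y < 1/4$ we have $1/4 < \xi < 1/2$.

We differentiate and find that $H'(x) = \log_2\frac{1-x}{x}$, $H''(x) = -\frac{1}{\ln 2} \cdot \frac{1}{x(1-x)}$.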
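
The two derivative formulas can be sanity-checked numerically with central differences. This is only an illustrative sketch; the function names and the step size are assumptions, and the formulas used are $H'(x) = \log_2\frac{1-x}{x}$ and $H''(x) = -\frac{1}{\ln 2 \cdot x(1-x)}$.

```python
import math


def H(x):
    """Binary entropy function."""
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)


def H1(x):
    """Claimed first derivative of H."""
    return math.log2((1 - x) / x)


def H2(x):
    """Claimed second derivative of H."""
    return -1.0 / (math.log(2) * x * (1 - x))


def central_difference(g, x, h=1e-5):
    """Numerical derivative of g at x."""
    return (g(x + h) - g(x - h)) / (2 * h)


if __name__ == "__main__":
    for x in (0.25, 0.3, 0.4, 0.5):
        print(x, central_difference(H, x) - H1(x), central_difference(H1, x) - H2(x))
```

At $x = 1/4$ the second derivative evaluates to $-\frac{16}{3\ln 2}$, and at $x = 1/2$ one gets $H = 1$ and $H' = 0$.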

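It follows that $H'\left(\frac{1}{2}\right) = 0$; also for $1/4 < \xi < 1/2$ the inequality
$H''\left(\frac{1}{4}\right) < H''(\xi) < H''\left(\frac{1}{2}\right)$ holds, in particular, $H''(\xi) > -\frac{16}{3\ln 2}$ (the function $H''(x)$ increases for
$0 < x < 1/2$). Also we take into account that $H\left(\frac{1}{2}\right) = 1$.

From (15) we have for any $y$, $0 \le y < 1/4$,

$$H\left(\frac{1}{2} - y\right) \ge H\left(\frac{1}{2}\right) - H'\left(\frac{1}{2}\right) y + \frac{1}{2} H''\left(\frac{1}{4}\right) y^2 = 1 - \frac{8}{3\ln 2}\, y^2.$$

Taking into account the last inequality in (14) we have

$$-\frac{8}{3\ln 2}\, y^2 + 2y \le a(n)$$

or

$$0 \le \frac{8}{3\ln 2}\, y^2 - 2y + a(n). \qquad (16)$$

The inequality (16) is quadratic with respect to $y$ and depends on the parameter
$n$. The coefficient of the quadratic term is positive, therefore $y$ can be determined
from the conditions $y \le y_1$ or $y \ge y_2$ where $y_1 < y_2$ are the roots of the characteristic
equation. The second condition is irrelevant and does not correspond to the
sense of this problem. The discriminant is equal to $4 - \frac{32}{3\ln 2}\, a(n)$.

Note that $\sqrt{\frac{2}{\pi}}\, e^{-4/9} < 1$, it follows that $\alpha < 0$. Let $\beta = -\alpha > 0$. Then
$a(n) = \frac{1}{2n}\log_2 n + \frac{\beta}{n}$ where $\beta > 0$.

Thus, it is sufficient to solve the inequality

$$0 \le \gamma y^2 - 2y + b(n) \qquad (17)$$

where $\gamma = \frac{8}{3\ln 2}$ and $b(n) = a(n)$.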

Positiveness of the discriminant means that $1 - \gamma b(n) > 0$ or $b(n) < 1/\gamma$, i.e.

$$\frac{1}{2n}\log_2 n + \frac{\beta}{n} < \frac{1}{\gamma}. \qquad (18)$$

The function $\frac{\ln x}{x}$ has its maximum at $x = e$. Let $n \ge 12$, then $\frac{1}{n}\log_2 n \le \frac{1}{12}\log_2 12$.
Hence, it is sufficient to demonstrate that

$$\frac{1}{24}\log_2 12 + \frac{1}{12}\log_2\left(\sqrt{\frac{\pi}{2}}\, e^{4/9}\right) < \frac{3\ln 2}{8}$$

or

$$\frac{1}{3}(2 + \log_2 3) + \frac{2}{3}\log_2\left(\sqrt{\frac{\pi}{2}}\, e^{4/9}\right) < 3\ln 2. \qquad (19)$$

The right part of (19) is greater than 2 since $e^2 < 8 = 2^3$. Consider the
left part of (19). It is equal to

$$\frac{2}{3} + \frac{1}{3}\log_2\left(\frac{3\pi}{2}\, e^{8/9}\right).$$

The product $\pi e < 10$, therefore $\frac{3\pi}{2} e^{8/9} < 16$. Hence, the left part of (19) is less
than 2. It follows that for $n \ge 12$ the discriminant of the equation (17) is positive
and the required upper bound for $y$ follows from the inequality



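$$y \le y_1 = \frac{1 - \sqrt{1 - \frac{8}{3\ln 2}\, a(n)}}{\frac{8}{3\ln 2}} \le \frac{1 - \left(1 - \frac{8}{3\ln 2}\, a(n)\right)}{\frac{8}{3\ln 2}} = a(n)$$

where $y_1$ is a root of the equation corresponding to the inequality (16). Keeping
in mind that $y = \frac{1}{2} - x = \frac{1}{2} - \frac{i}{n} = \frac{1}{2} - \frac{n-m-1}{n}$ we have:

$$\frac{m+1}{n} \le \frac{1}{2} + \frac{1}{2n}\log_2 n + \frac{1}{2n}\log_2\left(\frac{\pi}{2}\, e^{8/9}\right).$$

For $n \ge 12$ it follows that

$$m \le \frac{1}{2}n + \frac{1}{2}\log_2 n + const$$

where $const = \frac{1}{2}\log_2\left(\frac{\pi}{2}e^{8/9}\right) - 1$.

□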



C Proof of Theorem 16

For $i = k - 2 = n - m - 2$ in (3) we have

$$\binom{n}{i} \le (4^i - 1)(i + 1). \qquad (20)$$

As in the previous proof we use the inequality (8), the bounds (9) and (10)
valid for sufficiently high $n$ and the inequality (11). Combining (11) and (20) we
have

$$4^i (i + 1) \ge \sqrt{\frac{2}{\pi n}}\; e^{-4/9}\; 2^{H\left(\frac{i}{n}\right) n}.$$

Taking the logarithm in the last inequality we have:

$$2i + \log_2(i + 1) \ge H\left(\frac{i}{n}\right) n - \frac{1}{2}\log_2 n + \alpha, \qquad \alpha = \log_2\left(\sqrt{\frac{2}{\pi}}\, e^{-4/9}\right).$$

Introducing the new variable $x = \frac{i}{n}$ and dividing by $n$ we obtain

$$2x \ge H(x) - \frac{\log_2\left(x + \frac{1}{n}\right)}{n} - \frac{3\log_2 n}{2n} + \frac{\alpha}{n}.$$

Taking into account that $\log_2\left(x + \frac{1}{n}\right) \ge \log_2\left(\frac{1}{4} + \frac{1}{n}\right)$ we have:

$$H(x) - 2x \le a(n)$$

where $a(n) = \frac{3\log_2 n}{2n} + \frac{1}{n}\log_2\left(\frac{1}{4} + \frac{1}{n}\right) - \frac{\alpha}{n}$.

This inequality is analogous to the inequality (13); the only difference is in
the function $a(n)$. Using reasoning completely analogous to the reasoning
in the previous proof we deduce the inequality



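$$0 \le \gamma y^2 - 2y + b(n) \qquad (21)$$

where $\gamma = \frac{8}{3\ln 2}$, $y = \frac{1}{2} - x$, $b(n) = \frac{3\log_2 n}{2n} + \frac{1}{n}\log_2\left(\frac{1}{4} + \frac{1}{n}\right) + \frac{\beta}{n}$, $\beta = -\alpha$.

The solutions of this inequality satisfy $y \le a(n)$ (see Appendix B). It means
that $\frac{1}{2} - \frac{n-m-2}{n} \le \frac{3\log_2 n}{2n} + \frac{1}{n}\log_2\left(\frac{1}{4} + \frac{1}{n}\right) - \frac{\alpha}{n}$ or, rewriting,

$$m \le \frac{1}{2}n + \frac{3}{2}\log_2 n + \log_2\left(\frac{1}{4} + \frac{1}{n}\right) + c, \qquad c = -\alpha - 2.$$

Now we need only to know for which $n$ this inequality is satisfied, i.e. beginning
with which $n$ the discriminant of inequality (21) is nonnegative, or $b(n) \le 1/\gamma$,
or

$$\frac{3\log_2 n}{2n} + \frac{\log_2\left(\frac{1}{4} + \frac{1}{n}\right)}{n} + \frac{\log_2\left(\sqrt{\frac{\pi}{2}}\, e^{4/9}\right)}{n} \le \frac{3\ln 2}{8}.$$

Computer analysis shows that this inequality is true beginning with $n = 24$.
It completes the proof. □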
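
The "computer analysis" is easy to repeat. The sketch below assumes that the final inequality reads $\frac{3\log_2 n}{2n} + \frac{1}{n}\log_2\left(\frac{1}{4} + \frac{1}{n}\right) + \frac{1}{n}\log_2\left(\sqrt{\pi/2}\, e^{4/9}\right) \le \frac{3\ln 2}{8}$; this form is a reconstruction, and the function names and the search limit are illustrative.

```python
import math

BETA = math.log2(math.sqrt(math.pi / 2) * math.exp(4 / 9))
RHS = 3 * math.log(2) / 8


def lhs(n):
    """Left-hand side of the (reconstructed) final inequality of Appendix C."""
    return 3 * math.log2(n) / (2 * n) + math.log2(0.25 + 1 / n) / n + BETA / n


def first_n_from_which_it_holds(limit=10_000):
    """Smallest n0 such that lhs(n) <= RHS for every n in n0..limit."""
    n0 = None
    for n in range(limit, 0, -1):
        if lhs(n) <= RHS:
            n0 = n
        else:
            break
    return n0


if __name__ == "__main__":
    print(first_n_from_which_it_holds(), lhs(23), lhs(24), RHS)
```

Under this reading the inequality fails at $n = 23$ and holds from $n = 24$ up to the search limit, which matches the threshold stated in the proof.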
Abstract. We present an algorithm for counting points on superelliptic
curves $y^r = f(x)$ over a finite field $F_q$ of small characteristic different
from $r$. This is an extension of an algorithm for hyperelliptic curves due
to Kedlaya. In this extension, the complexity, assuming $r$ and the genus
are fixed, is $O(\log^{3+\varepsilon} q)$ in time and space, just like for hyperelliptic
curves. We give some numerical examples obtained with our first
implementation, thus proving that cryptographic sizes are now reachable.



1 Introduction 

In the past few years a lot of candidates have been proposed to enlarge the set of 
groups that can be used in protocols based on the discrete logarithm problem like 
Diffie-Hellman or ElGamal. Beside the classical multiplicative groups of finite 
fields, the most famous are certainly the systems based on elliptic curves [21, 
26]. Indeed, for these systems the only general attacks known are variants of the 
Pollard Rho method which require exponential time computation; in practice it 
means that the key size is much shorter than in a system that uses finite fields. 
Thereafter, systems based on hyperelliptic curves were proposed [22] . They seem 
to have the same advantages as elliptic curve cryptosystems (at least when the 
genus is less than 4 [1,14]). 

More recently, systems based on the discrete logarithm problem in the
Jacobians of other curves were designed. Namely, in the literature, we can now find
algorithms for working in Jacobians of superelliptic curves [13] and of $C_{ab}$ curves
[2]. Several works related to these curves have already been published, concerning
security issues [4], efficiency [17,6], building curves with known number of points
[3], or possible use in a Weil restriction attack on elliptic curves [5]. The next
step for studying the possible cryptographic use of these curves is to conceive an
algorithm for counting points of the Jacobian of a random curve. Indeed, this is
thought to be one of the most secure ways of building a cryptosystem by a large
part of the community.

In the case of elliptic curves, this problem of point counting has been a chal- 
lenge of the past 15 years and nowadays we have satisfactory solutions. When 
the characteristic of the base field is large the best known method is Schoof’s 
algorithm and all the improvements leading to the so-called Schoof-Elkies-Atkin (SEA)
algorithm. We refer to [7] or [23] for surveys of these techniques and to the
references therein. Besides some theoretical results [29] and an attempt to make
them practical [15], extending the SEA algorithm to higher genus has not yet
proven to be enough for cryptographic sizes. The situation is quite different in
small characteristic: two years ago, Satoh [32] showed that p-adic methods using
the canonical lift could lead to an algorithm asymptotically faster than SEA.
Some work has subsequently been done on the subject to extend it to
characteristic 2 [33,9], to implement it and obtain new records [9], to use less memory
[34], and to combine it with an early-abort strategy for generating secure curves
[10]. Mestre, Harley and Gaudry recently proposed a related algorithm, based
on the arithmetic-geometric mean (AGM), for elliptic curves and hyperelliptic curves
of genus 2 in characteristic 2; a nice feature of this technique is that it does not
explicitly make use of j-invariants, of modular equations nor of Vélu-type
formulae, and these had previously been the main obstructions to generalizing beyond
the elliptic case. However, the AGM method does not seem to extend easily to
non-hyperelliptic curves. Another approach, also using p-adic methods but not
based on canonical lifting, has been proposed by Kedlaya [18]. His method
applies to hyperelliptic curves in small odd characteristic. The complexity in time
is $O(\log^{3+\varepsilon} q)$, for curves over $F_q$ of fixed genus, i.e. the same as all the variants
of Satoh's method, and the complexity in space is $O(\log^3 q)$, which is the same as
Satoh's original algorithm, but bad compared to the algorithm of [34] or AGM.

C. Boyd (Ed.): ASIACRYPT 2001, LNCS 2248, pp. 480-494, 2001.

The contribution of this paper is twofold: firstly we show that Kedlaya’s al- 
gorithm can be extended in a rather straightforward way to superelliptic curves; 
secondly we report some results obtained with our first implementation writ- 
ten in Magma. To our knowledge, these are the first published point counting 
computations for random hyperelliptic and superelliptic curves of cryptographic 
sizes. 

The paper is organized as follows: after recalling some basics about curves 
and p-adic numbers, we describe Kedlaya’s original algorithm and show how to 
adapt it for superelliptic curves. Then we give some more details on the way 
these algorithms can be handled in practice and we estimate the complexity. We 
conclude by numerical examples and remarks about the use of these curves in 
cryptography. 



2 Background on Algebraic Curves and p-Adic Number 
Rings 

In this section, we recall some basic facts about algebraic curves over finite fields 
and p-adic numbers. We shall not give precise definitions and we refer the reader 
to classical books on the subject ([12,20,19,24] for instance). 



2.1 Hyperelliptic and Superelliptic Curves 

Let $F_q$ be a finite field with $q = p^n$ elements. We shall consider only two types
of curves over $F_q$, namely hyperelliptic and superelliptic curves.




482 P. Gaudry and N. Giirel 



Definition 1 A superelliptic curve is a plane curve $C$ which admits an affine
equation of the form

$$y^r = f(x),$$

where $r$ is a prime different from $p$ and $f$ is monic, squarefree of degree $d$ coprime
to $r$.

With such a definition, $C$ is non-singular in its affine part, and admits a unique
place of degree 1 at infinity. Moreover its genus is given by $g = \frac{(r-1)(d-1)}{2}$.
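
As a small illustration, the hypotheses of Definition 1 and the standard genus formula $g = (r-1)(d-1)/2$ for such curves can be encoded as follows. The function names are hypothetical, and squarefreeness of $f$ is not checked since only the degree is passed in.

```python
from math import gcd


def is_prime(m):
    """Trial-division primality test, adequate for small r."""
    return m >= 2 and all(m % t for t in range(2, int(m ** 0.5) + 1))


def superelliptic_genus(r, d, p):
    """Genus of y^r = f(x) with deg f = d over a field of characteristic p."""
    if not is_prime(r) or r == p:
        raise ValueError("r must be a prime different from the characteristic p")
    if gcd(r, d) != 1:
        raise ValueError("deg f must be coprime to r")
    return (r - 1) * (d - 1) // 2


if __name__ == "__main__":
    print(superelliptic_genus(2, 5, 3))  # hyperelliptic case, d = 2g + 1
    print(superelliptic_genus(3, 4, 2))
```

For $r = 2$ and $d = 2g + 1$ this gives back the hyperelliptic genus $g$.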

Definition 2 In characteristic different from 2, a hyperelliptic curve is a
superelliptic curve whose equation is of the form $y^2 = f(x)$, with $r = 2$ and $f$ of
degree $2g + 1$.

Note that there exists a more general definition of hyperelliptic curves which
does not exclude the case of characteristic 2. But the algorithms we will describe
work only for this particular case.

Let $C$ be a superelliptic curve of genus $g$. Associated to this curve, one can
define its Jacobian, denoted $J(C)$, which is a finite abelian group. In the past few
years, several algorithms were developed to compute explicitly in this group [13,
2,17,6]. The next step is to study the order of $J(C)$. For this the $q$-th power
Frobenius endomorphism and its characteristic polynomial $\chi(T)$ are key tools.
More precisely, $\chi(T)$ can be written as

$$\chi(T) = \sum_{i=0}^{2g} a_i T^i, \quad \text{with } a_{2g} = 1,\ a_i = q^{g-i} a_{2g-i} \text{ for } i = 0, \ldots, g-1,$$

and all its roots have absolute value $\sqrt{q}$. This is essentially the Riemann
Hypothesis for zeta functions of curves. For us, the interesting fact is that $\#J(C) = \chi(1)$.
Our goal in this paper is to compute $\chi(T)$ and to obtain $\#J(C)$ as a byproduct.
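
The symmetry $a_i = q^{g-i} a_{2g-i}$ means that the upper half of the coefficients determines $\chi(T)$, and hence $\#J(C) = \chi(1)$. A minimal sketch, with hypothetical function names and the input given as the list $[a_g, a_{g+1}, \ldots, a_{2g-1}]$:

```python
def full_coefficients(high, q):
    """Return [a_0, ..., a_2g] from high = [a_g, ..., a_(2g-1)] and a_2g = 1."""
    g = len(high)
    a = [0] * (2 * g + 1)
    a[2 * g] = 1
    for offset, value in enumerate(high):
        a[g + offset] = value
    for i in range(g):
        a[i] = q ** (g - i) * a[2 * g - i]  # a_i = q^(g-i) * a_(2g-i)
    return a


def jacobian_order(high, q):
    """#J(C) = chi(1), the sum of all coefficients."""
    return sum(full_coefficients(high, q))


if __name__ == "__main__":
    # genus 1 with trace t: chi(T) = T^2 - t T + q, so chi(1) = q + 1 - t
    print(jacobian_order([-2], 7))
    print(full_coefficients([2, -1], 3))
```

In genus 1 this reduces to the familiar count $q + 1 - t$ for an elliptic curve with trace of Frobenius $t$.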

2.2 The Ring $Z_q$

Let $K$ be the (unique up to isomorphism) unramified extension of degree $n$ of
$Q_p$; its residue field is $F_q$. We denote by $Z_q$ the ring of integers of $K$. In order
to construct it, we can start with the polynomial $\overline{P}(t)$ which defines $F_q$ as an
algebraic extension of $F_p$; we then consider the extension

$$Z_q := Z_p[t]/(P(t)),$$

where the polynomial $P(t)$ is obtained from $\overline{P}(t)$ by lifting trivially its
coefficients to p-adic integers. In practice, an element $z$ of $Z_q$ can be represented as
a polynomial $z = z_{n-1}t^{n-1} + z_{n-2}t^{n-2} + \cdots + z_1 t + z_0$ taken modulo $P(t)$ and
where the $z_i$ are integers modulo a power of $p$ called the precision at which the
computation is done.
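
This representation translates directly into code: an element is a list of $n$ integers modulo $p^N$, and products are reduced modulo the monic polynomial $P(t)$. The sketch below is illustrative only (schoolbook multiplication, hypothetical function name, coefficients stored low degree first).

```python
def zq_mul(a, b, P, p, N):
    """Multiply a and b in Z_p[t]/(P(t)) with coefficients reduced mod p^N.

    a, b: lists of n integers (low degree first).
    P:    monic polynomial of degree n as a list of n + 1 integers.
    """
    mod = p ** N
    n = len(P) - 1
    result = [0] * (2 * n - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            result[i + j] = (result[i + j] + ai * bj) % mod
    # eliminate t^k for k >= n using t^n = -(P[0] + P[1] t + ... + P[n-1] t^(n-1))
    for k in range(2 * n - 2, n - 1, -1):
        c = result[k]
        result[k] = 0
        for j in range(n):
            result[k - n + j] = (result[k - n + j] - c * P[j]) % mod
    return result[:n]


if __name__ == "__main__":
    # Z_9 = Z_3[t]/(t^2 + 1) at precision 3^4: t * t = -1
    print(zq_mul([0, 1], [0, 1], [1, 0, 1], 3, 4))
```

A serious implementation would replace the quadratic-time product by fast multiplication, as the complexity analysis assumes.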

It can be shown that the Galois group of $K$ over $Q_p$ is cyclic. We will denote
by $\sigma$ the unique generator, also called Frobenius, of this Galois group that reduces
modulo $p$ to the $p$-th power Frobenius in $F_q$. There is no trivial formula for
writing $z^\sigma$ for an element $z$ in $Z_q$ expressed on a polynomial basis as above.
Later on, we will describe how to precompute $t^\sigma$ and then $z^\sigma$ is obtained as
follows:

$$z^\sigma = \left(\sum_{i=0}^{n-1} z_i t^i\right)^{\sigma} = \sum_{i=0}^{n-1} z_i (t^\sigma)^i.$$

3 Kedlaya’s Algorithm and Its Extension 

3.1 Overview of Kedlaya’s Algorithm for Hyperelliptic Curves 

Let $C$ be a hyperelliptic curve of genus $g$ given by its equation $y^2 = f(x)$ over
$F_q$. Following the construction of Kedlaya (see also [20], page 72), we consider
the curve $C'$ obtained from $C$ by removing the point at infinity and the points
with vertical tangent (i.e. $y = 0$).

There is a way to lift the coordinate ring of $C'$ called the weak completion [27],
with the nice property that its cohomology verifies a "Lefschetz trace formula"
[28] and hence gives information about the cardinalities of the initial curve $C$.

Taking a lowbrow point of view in which we can forget about the curve $C'$,
we shall work on the vector space generated over the p-adic number field $K$ by
the following differential forms:

$$\mathcal{D} = \left\{ \frac{x^i\,dx}{y} :\ i \in [0, 2g-1] \right\},$$

in which we have the relations coming from the equation of the curve and
$d(\varphi(x, y)) = 0$ for every rational function $\varphi$. On the differential forms one can
define a Frobenius action which is compatible with the $p$-th power Frobenius
on $C$: take $x^\sigma = x^p$, $y^\sigma$ given by $(y^\sigma)^2 = f(x)^\sigma$ and $(dx)^\sigma = p x^{p-1} dx$. Kedlaya
shows in a constructive way that the space $\mathcal{D}$ is stable under the action of this $\sigma$.
Hence $\sigma$ is an endomorphism of a vector space of dimension $2g$; and everything
is done in order for its characteristic polynomial to be closely related to the
$\chi(T)$ we are looking for. The heart of Kedlaya's algorithm is then to compute
the matrix of $\sigma$ for the given basis of $\mathcal{D}$.

For each $i$ in $[0, 2g-1]$,

$$\left(\frac{x^i\,dx}{y}\right)^{\sigma} = \frac{1}{y^\sigma}\, p\, x^{ip+p-1}\,dx,$$

therefore the tricky part is the computation of $\frac{1}{y^\sigma}$. This is not defined in a lifted
coordinate ring because it involves a square root and that is a reason why we
use the weak completion. From a practical point of view, it means that we shall
be able to expand $\frac{1}{y^\sigma}$ as a power series in $\tau = y^{-2}$: starting with the definition
we have

$$\frac{1}{y^\sigma} = \left(f(x)^\sigma - f(x)^p + f(x)^p\right)^{-1/2} = y^{-p}\left(1 + \tau^p\left(f(x)^\sigma - f(x)^p\right)\right)^{-1/2}.$$

By the usual power series expansion of $(1 + X)^{-1/2}$ we get an expression of the
form

$$\frac{1}{y^\sigma} = y^{-p} \sum_{k \ge 0} P_k(x)\, \tau^{pk}.$$

Note that $p$ divides $(f(x)^\sigma - f(x)^p)$ so that the power of $p$ dividing $P_k(x)$ tends
to infinity as $k$ grows (actually this is what is expected due to the theoretical
construction of the weak completion). We can now write

$$\left(\frac{x^i\,dx}{y}\right)^{\sigma} = \sum_{k \ge 0} Q_k(x)\, \tau^k\, \frac{dx}{y}$$

where $Q_k(x)$ are polynomials. The algorithm proceeds as follows: we compute
this expression up to some precision in $\tau$, and then we use the relations in $\mathcal{D}$
described above to reduce the expression to a polynomial of degree at most $2g-1$,
times $\frac{dx}{y}$. In this way we shall prove that $\mathcal{D}$ is indeed $\sigma$-stable and moreover we
obtain an explicit description of the action of $\sigma$ on the basis. For this we will use
three strategies of reduction:


Red 1. First of all, using the equation of the curve, one can write

$$Q_k(x)\tau^k = \left(\alpha_k(x) f(x) + \beta_k(x)\right)\tau^k = \alpha_k(x)\tau^{k-1} + \beta_k(x)\tau^k,$$

where $\alpha_k$ and $\beta_k$ are the quotient and the remainder in the division of $Q_k$
by $f$. Therefore one can assume that $Q_k(x)$ is of degree at most $2g$ for all $k$,
except for $Q_0(x)$ for which one can show that the degree is at most $2pg - 1$.

Red 2. Then we use the relations of cohomology to rewrite the series in the
form $Q(x)\frac{dx}{y}$. Fix $k \ge 1$ and consider the term $Q_k(x)\tau^k\frac{dx}{y}$. Let $U(x)$ and
$V(x)$ be such that $Q_k(x) = U(x)f(x) + V(x)f'(x)$ (they do exist because $f$
is squarefree). Using

$$d\left(\frac{V(x)}{y^{2k-1}}\right) = 0,$$

one obtains

$$Q_k(x)\,\tau^k\,\frac{dx}{y} \equiv \left(U(x) + \frac{2V'(x)}{2k-1}\right)\tau^{k-1}\,\frac{dx}{y}.$$

Repeating this for decreasing $k$'s, we can rewrite everything on the constant
term of the series.

Red 3. Finally, in the expression $Q(x)\frac{dx}{y}$ that we obtained, one can reduce the
degree $\delta$ of $Q$ to at most $2g - 1$ in the following way. Assume $\delta \ge 2g$: using

$$d\left(x^{\delta-2g}\, y\right) = 0,$$

one gets a polynomial of degree $\delta$ that can be subtracted from $Q$.

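At this point, we have computed a $2g \times 2g$ matrix $M$ such that

$$\begin{pmatrix} \left(\frac{dx}{y}\right)^{\sigma} \\ \vdots \\ \left(\frac{x^{2g-1}dx}{y}\right)^{\sigma} \end{pmatrix} = M \begin{pmatrix} \frac{dx}{y} \\ \vdots \\ \frac{x^{2g-1}dx}{y} \end{pmatrix}.$$

Most of the operations done during the computation involve elements of $Z_q$, but
at the end we may have to divide by small powers of $p$. Finally the coefficients
of $M$ lie in $p^{-s}Z_q$ with a small, predictable $s$, which depends only on $p$ and $g$.
The final step is then to compute the characteristic polynomial of the norm of the
matrix $M$ (the product of $M$ with its conjugates under $\sigma$), which has coefficients
in $Z_p$ and is a p-adic approximation of $\chi(T)$.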



3.2 Superelliptic Curves 

Let $C$ be a superelliptic curve given by its equation $y^r = f(x)$ with $f$ of degree $d$
over $F_q$. The theory is exactly the same as for hyperelliptic curves. In the present
case, the space of differential forms we consider is

$$\mathcal{D} = \left\{ \frac{x^i\,dx}{y^j} :\ i \in [0, d-2],\ j \in [1, r-1] \right\}.$$

The Frobenius action lifting the $p$-th power Frobenius on $C$ is defined similarly:
take $x^\sigma = x^p$, $y^\sigma$ given by $(y^\sigma)^r = f(x)^\sigma$ and $(dx)^\sigma = p x^{p-1} dx$.

Again, the space of differential forms has been chosen such that it is stable
under the action of $\sigma$; we will now describe the reduction process which allows
us to rewrite $\left(\frac{x^i\,dx}{y^j}\right)^{\sigma}$ over the basis. Fix an $i \in [0, d-2]$ and a $j \in [1, r-1]$.

We can write $\frac{1}{(y^\sigma)^j}$ as a power series

$$\frac{1}{(y^\sigma)^j} = y^{-jp}\left(1 + \frac{f(x)^\sigma - f(x)^p}{y^{rp}}\right)^{-j/r} = y^{-jp}\sum_{k \ge 0} P_k(x)\,\tau^{pk},$$

where we have set $\tau = y^{-r}$. Hence we can write

$$\left(\frac{x^i\,dx}{y^j}\right)^{\sigma} = \sum_{k \ge 0} Q_k(x)\,\tau^k\,\frac{dx}{y^{\,jp \bmod r}}.$$

In the following, we let $\ell = jp \bmod r$. We now proceed with three reduction
steps similar to those we had for hyperelliptic curves.

Red 1. First, use the equation of the curve to obtain a series where the $Q_k(x)$
are of degree at most $d - 1$, except for the first one.

Red 2. Then, rewrite the term in $\tau^k$ as a term in $\tau^{k-1}$. For $k \ge 1$, let $U(x)$ and
$V(x)$ be such that $Q_k(x) = U(x)f(x) + V(x)f'(x)$; one has

$$Q_k(x)\,\tau^k\,\frac{dx}{y^\ell} \equiv \left(U(x) + \frac{r\,V'(x)}{r(k-1) + \ell}\right)\tau^{k-1}\,\frac{dx}{y^\ell}.$$

Red 3. Finally, we are left with an expression of the form $Q(x)\frac{dx}{y^\ell}$, where $Q(x)$ is
a polynomial of degree $\delta$ that we can reduce to degree at most $d - 2$: assume
$\delta \ge d - 1$, the exact differential $d\left(x^{\delta-d+1} y^{r-\ell}\right) = 0$ gives a polynomial of
degree $\delta$ that can be subtracted from $Q(x)$.

We obtain a $2g \times 2g$ matrix $M$ and we conclude as before by taking the
characteristic polynomial of its "norm".

Note that the differential forms in $\frac{x^i\,dx}{y^j}$ are sent by $\sigma$ to the subspace generated
by forms in $\frac{x^i\,dx}{y^\ell}$ with $\ell = jp \bmod r$. As a consequence, $M$ is a matrix that can
be viewed in blocks of size $d - 1$, with the property that there is exactly one
non-zero block in each row block and each column block.



4 Details and Complexity 



4.1 Precision of the Computation 

The intermediate result obtained from the algorithm of section 3 is an
approximation of the polynomial $\chi(T)$ that we are looking for, and by computing to
sufficient precision we can determine it exactly. Two parameters have to be
tuned, to ensure that at the end we get enough information to conclude. The
first is the p-adic precision $p^\nu$ at which we truncate elements of $Z_q$. The second
is the $\tau$-adic precision at which we truncate the series.

Bounds on the coefficients of $\chi(T)$ can be deduced from the bounds on its
roots: $|a_i| \le \binom{2g}{i} q^{i/2}$ for $i \in [1, g]$. We assume that $q$ is large compared to the
genus, so that $a_g$ determines the required precision. Hence we need to know
$\chi(T)$ modulo $2\binom{2g}{g} q^{g/2}$ to be sure to recover all the coefficients. Therefore the
working precision should be at least

$$\nu \ge \log_p\left(2\binom{2g}{g} q^{g/2}\right).$$
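
The precision bound and the recovery of an integer coefficient from its residue can be sketched with exact integer arithmetic. The code assumes the bound $p^\nu \ge 2\binom{2g}{g} q^{g/2}$ with $q = p^n$ and a symmetric residue interval around 0; the function names are illustrative.

```python
from math import comb


def padic_precision(p, n, g):
    """Smallest nu with p^nu >= 2 * C(2g, g) * q^(g/2), where q = p^n.

    Both sides are squared so that no floating-point logarithm is needed.
    """
    q = p ** n
    bound_squared = (2 * comb(2 * g, g)) ** 2 * q ** g
    nu = 0
    while p ** (2 * nu) < bound_squared:
        nu += 1
    return nu


def centered_lift(residue, p, nu):
    """Integer of smallest absolute value congruent to residue modulo p^nu."""
    modulus = p ** nu
    r = residue % modulus
    return r - modulus if r > modulus // 2 else r


if __name__ == "__main__":
    print(padic_precision(3, 53, 2))   # genus 2 over F_{3^53}
    print(centered_lift(23, 5, 2))     # 23 mod 25 represents -2
```

For a genus 2 curve over the field with $3^{53}$ elements this gives $\nu = 56$, before any extra safety margin an implementation might add.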






The precision in $\tau$ is more problematic: at first sight it is not clear that we
do not need all the terms of the series to get a result which makes sense even
modulo $p$. Actually in the power series expansion, one can see that the coefficient
in $\tau^k$ (which is a polynomial over $Z_q$) is divisible by a power of $p$ which grows
to infinity at the speed of $k/p$. Hence it appears that the precision $\mu$ in $\tau$ should
be at least $p$ times the p-adic precision $\nu$. Moreover, the reduction process also
perturbs things: starting with a term $Q_k(x)\tau^k\frac{dx}{y^\ell}$, with $p^m$ dividing $Q_k(x)$, one
reduces to a differential form $Q(x)\frac{dx}{y^\ell}$ and $p^m$ does not divide $Q(x)$ any more.
In Lemma 1 of [18], Kedlaya shows that we can bound the loss of precision by
$\log_p(rk + \ell)$. Accordingly, it is sufficient to enlarge $\mu$ slightly to ensure that at the
end we have the required precision. A tedious calculation leads to the following
choice for $\mu$: we take the smallest $\mu$ such that

^i>pv- -+p\og J{r + l)p - 1). 
r ^ 



4.2 Detailed Algorithm 

We summarize the algorithm in the following:

Input: A superelliptic curve $y^r = f(x)$ over $F_q$, $q = p^n$; the degree of $f$ is denoted $d$.

Output: The characteristic polynomial $\chi(T)$.

1. Set the p-adic working precision $\nu = \left\lceil \log_p\left(2\binom{2g}{g} q^{g/2}\right) \right\rceil$ and set the maximal
precision $\mu$ for the series to be the smallest value satisfying the bound on $\mu$
given in Section 4.1.

2. Let $S = 1 + \left(f(x)^\sigma - f(x)^p\right)\tau^p$, where $f(x)$ is the polynomial $\overline{f}(x)$ whose
coefficients are lifted arbitrarily from $F_q$ to $Z_q$.

3. Compute $S^{-1/r}$ as a truncated series in $\tau$, to precision $\tau^\mu$.
For this, use a Newton iteration $X \leftarrow \frac{1}{r}\left((r+1)X - S X^{r+1}\right)$, initialized with
$X = 1$. At each step in the recursion, use Red 1 to keep the coefficients of
the series of degree at most $d - 1$.

4. Compute $S^{-j/r}$ for $j \in [2, r-1]$ up to precision $\tau^\mu$. This is done by
multiplying $S^{-1/r}$ by itself repeatedly; again, use Red 1 after each multiplication.

5. For each $i \in [0, d-2]$ and $j \in [1, r-1]$ do

a. Compute $\left(\frac{x^i\,dx}{y^j}\right)^{\sigma}$.

b. Use Red 2 to write it in the form $Q(x)\frac{dx}{y^{\,jp \bmod r}}$.
During this reduction it is sometimes necessary to divide by an integer
which is divisible by $p$. In theory, this ought to reduce the precision of
the computation. Actually, when this occurs, one adds some arbitrary
noise to "force" the precision to remain maximal. This strange way of
doing things does not actually affect the final result because this noise
will cancel out during the whole process. This is ensured by Lemma 1 in
[18], which extends naturally to the superelliptic case.

c. Use Red 3 to reduce the degree of $Q(x)$.

6. Compute the matrix $M$, its norm and its characteristic polynomial
$\tilde{\chi}(T) = \sum_{0 \le k \le 2g} \tilde{a}_k T^k$.

7. For $k \in [1, g]$, find the integer $a_k$ in $\left[-\frac{p^\nu}{2}, \frac{p^\nu}{2}\right]$ congruent to $\tilde{a}_k$ modulo $p^\nu$.
Return the corresponding $\chi(T)$.

4.3 Complexity

For the complexity analysis we shall make the following assumptions: 

— The characteristic p is fixed; 

— The parameters r and d of the curves are fixed, hence also the genus; 

— Each time we have to do a multiplication between two elements of a rather 
complicated structure (truncated series over polynomials over polynomials 
over integers), we assume that we pack everything into large integers and that 
we use Schönhage's fast multiplication algorithm. A multiplication between
two objects of bit-size $N$ is then assumed to take time $O(N^{1+\varepsilon})$.

In Step 2 we have to apply Frobenius to some elements of $Z_q$. For this, note
that $t$ being a root of $P(t)$, so is $t^\sigma$. Therefore, $t^\sigma$ can be obtained by a Newton
iteration $X \leftarrow X - P(X)/P'(X)$ initialized with $t^p$. This is just a precomputation
and moreover the cost is comparable to the rest of the algorithm. Thereafter, it
is possible to obtain the Frobenius of an element in Zg in time 

Step 3 is a Newton lifting. The cost is bounded by a constant times the
cost of the last iteration. This last iteration costs a few multiplications between
objects which are polynomials of degree $\mu$ over polynomials of degree $d - 1$ with
coefficients in $Z_q$. An element of $Z_q$ is of bit-size $n\nu$, therefore the bit-size of the
objects is $n\nu\mu d = O(n^3)$. Hence the $O(1)$ multiplications we have to do in
the final iteration take time in $O(n^{3+\varepsilon})$. Applying Red 1 to the result has the
same asymptotic complexity (we have to visit the whole object and the runtime
is linear in its size) but is faster in practice. Finally the overall complexity of
Step 3 is in $O(n^{3+\varepsilon})$.
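
The Newton iteration $X \leftarrow \frac{1}{r}\left((r+1)X - S X^{r+1}\right)$ can be illustrated in the simplest possible setting, with plain integers modulo $p^N$ standing in for the truncated series over $Z_q$. The sketch assumes $S \equiv 1 \pmod p$ and $p \nmid r$, so that each step doubles the p-adic precision; the function name is hypothetical and Python 3.8 or later is needed for the modular inverse via `pow`.

```python
def inverse_rth_root(s, r, p, N):
    """Return x with x^r * s == 1 (mod p^N), for s == 1 (mod p) and p not dividing r."""
    mod = p ** N
    r_inv = pow(r, -1, mod)       # 1/r modulo p^N
    x = 1                         # correct modulo p because s == 1 (mod p)
    precision = 1
    while precision < N:
        # Newton step for X^(-r) - s = 0; the number of correct digits doubles
        x = (r_inv * ((r + 1) * x - s * pow(x, r + 1, mod))) % mod
        precision *= 2
    return x


if __name__ == "__main__":
    x = inverse_rth_root(7, 2, 3, 10)
    print(x, (x * x * 7) % 3 ** 10)
```

Since the precision doubles at every step, the cost is dominated by the last iteration, which is the observation used in the analysis of Step 3.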

In Step 4 we do a constant number of multiplications (remember r is constant) 
and then an application of Red 1 to objects of size in 0(n^+®). Again the 
complexity is 0(n^+^). Note that in the hyperelliptic case, this step does not 
exist. 

In Step 5 we repeat a reduction process 2g times using Red 2 and Red 3. 
More precisely, Substep 5. a is only reorganizing and applying Red 1; this takes 
negligible time. In Substep 5.b we repeat p times a process which involves ele- 
mentary operations over polynomials of degree at most d over Zg, i.e. a constant 
number of operations in Zg. Hence Substep 5.b has a cost in 0(n^+®). The third 
reduction in Substep 5.c is negligible. 

In Step 6 the costly part is to compute the norm of the matrix. By a recursive 
“divide and conquer” computation, we can save some of the costly Frobenius 
computations and obtain a runtime again in 0(n^“'"®). In [31], Satoh proposes 
another method which can moreover save memory. 

Putting everything together, the complexity of the algorithm is in $O(n^{3+\varepsilon})$
time and in $O(n^3)$ space.

5 Numerical Results and Cryptographic Significance 

As far as we know, even the original algorithm of Kedlaya has not yet been
tested in practice. The first aim of our implementation was therefore to
validate Kedlaya's algorithm and our extension. We used Magma, version 2.7,
which allowed us to easily manipulate quite complicated objects: it is possible in
Magma to construct the ring $\mathbb{Z}_q$ and to build the ring of series over polynomials
over $\mathbb{Z}_q$ which is required. However, working at such a high programming level, we
cannot hope to do all the optimizations we would like; furthermore,
there are some small bugs in our version of Magma which make us lose precision
from time to time, so we had to take a (constant) added margin in the precision
of the computations. Therefore the results we give here are just meant to show
that the algorithm works in practice and that cryptographic sizes are clearly
reachable. We are currently working on an optimized implementation in C which
should reduce the runtime significantly.

All the examples have been run on an Alpha EV6 at 667 MHz. The numbers 
of points are small enough that it is possible to factor them and prove the results. 
The space requirement was roughly 150 MB. 

5.1 Hyperelliptic Examples 

In the hyperelliptic case, we cannot take a field of characteristic 2 for which the 
algorithm is not designed. We carried out our experiments with finite fields of 
characteristic 3. 

Example 1. 

In F353, we take the generator t given by -I- 2t^ -|- 2t^ + 2t^ -1-1 = 0 and 
consider the genus 2 randomly chosen curve given by 

$$y^2 = x^5 + t^{2321121798755003703020989} x^4 + t^{8444066873716648223072527} x^3 + t^{7946343052437940195139141} x^2 + t^{10959512142684015392587300} x + t^{11366373156356845343093334}.$$

After about 22 hours of computation we found the coefficients of its charac- 
teristic polynomial to be 

$a_1 = 3767947898876$,

$a_2 = 16462680188903823501200294$,

which yields a cardinality of 

N = 375710212613709295385367112322529717794218564821248. 

Example 2. 

We took a randomly chosen curve over the finite field with q = 3^^. Let t 
with minimal polynomial -I- -I- 2t^ -|- 2t -|- 1, and consider the genus 3 curve 

of equation

$$y^2 = x^7 + t^{145005605337803244} x^6 + t^{367106618571281107} x^5 + t^{377813655811225893} x^4 + t^{47288412099057887} x^3 + t^{55871015404698790} x^2 + t^{232037785016055219} x + t^{286815047052544398}.$$

After about 30 hours of computation we found the coefficients of its charac- 
teristic polynomial to be 
$a_1 = 1128783670$,

$a_2 = 1117168429648455309$,

$a_3 = 886287268279616285414037148$,

which yields a cardinality of 

N = 91297581893980817420223885655399261733128358845689672. 

5.2 Superelliptic Examples 

For superelliptic curves, we concentrate on characteristic 2 which is the most 
interesting case for practical applications. 

Example 1. 

In F 253 , we take the generator t given by + 1 + 1 = 0 and consider 

the randomly chosen curve

$$y^3 = x^4 + t^{2256567407303775} x^3 + t^{7508555791178511} x^2 + t^{1136027055799467} x + t^{4967542575384673}.$$

After about 22 hours of computation we found the coefficients of its charac- 
teristic polynomial to be 

$a_1 = 0$,

$a_2 = -2299871474212151$,

$a_3 = 0$,

which yields a cardinality of 

N = 730750818665451438386441787834386121601727865546. 

The vanishing of $a_1$ and $a_3$ is not a surprise: it is explained by the absence of
third roots of unity in the base field (see below).
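
The step from the coefficients to the cardinality can be sketched in a few lines of Python. This is only an illustration, not the authors' Magma code: the function name `jacobian_order` is hypothetical, and it assumes the characteristic polynomial has the shape $T^{2g} + a_1 T^{2g-1} + \dots + a_g T^g + q\,a_{g-1} T^{g-1} + \dots + q^g$ (the shape given for genus 3 in Section 5.3) and that the cardinality is its value at $T = 1$. The base field is taken to be $\mathbb{F}_{2^{53}}$ as in the example above.

```python
def jacobian_order(q, a):
    """Evaluate chi(1), where a = [a_1, ..., a_g] and
    chi(T) = T^(2g) + a_1 T^(2g-1) + ... + a_g T^g
             + q a_(g-1) T^(g-1) + ... + q^(g-1) a_1 T + q^g.
    (Assumed shape of the characteristic polynomial.)"""
    g = len(a)
    total = 1 + q**g              # leading term and constant term at T = 1
    total += a[g - 1]             # middle coefficient a_g
    for i in range(1, g):         # a_i occurs twice: plain and scaled by q^(g-i)
        total += a[i - 1] * (1 + q**(g - i))
    return total

# Coefficients reported for the first superelliptic example (q = 2^53).
q = 2**53
N = jacobian_order(q, [0, -2299871474212151, 0])
print(N)
```

Under these assumptions the printed value agrees with the cardinality $N$ reported for this example.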

Example 2. 

In F 258 , we take the generator t given by t®® -I- -I- 1 = 0 and consider the 

random curve

$$y^3 = x^4 + t^{184416898722999862} x^3 + t^{138153554162118062} x^2 + t^{90053985362597546} x + t^{159188191651769175}.$$

After about 28 hours of computation we found the coefficients of its charac- 
teristic polynomial to be 

$a_1 = 1346491223$,

$a_2 = 540650236559852363$,

$a_3 = 106786896758507851646763008$,

which yields a cardinality of 

N = 23945242937891627923322882122316144789744381897954979. 

The cardinalities we found were not (almost) prime and these curves should 
not be used in cryptography. We could have repeated the computations for sev- 
eral curves until we found a good curve. Note that an early-abort strategy cannot 
be used in this context. 

5.3 Some Remarks about Superelliptic Curves in Cryptography

When one wants to build a cryptosystem based upon a curve, there are some 
security issues that have to be taken into account. Besides the fact that the 
number of points of the Jacobian should be (almost) prime, the following attacks 
(or threats) should be avoided: 

1. Index-calculus attack for high genus curves [14,4]: the genus of the curve 
should be at most 3. 

2. MOV attack [25,11]: the smallest $k$ such that $\#J(C) \mid q^k - 1$ should be large.

3. Rück's attack [30]: the order of the subgroup in which we are working should
be coprime to $p$.

4. The curve should not have “special properties”. 

Item 1 means that we are left with a small choice of non-elliptic curves useful
for cryptography: hyperelliptic curves of genus 2 and 3, and superelliptic curves
of the form $y^3 = f(x)$ with $f$ of degree 4.

Items 2 and 3 are almost always fulfilled when we choose random curves and 
the verification that this is indeed the case for a given curve is straightforward. 
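
A minimal sketch of that verification, in Python, is given here for illustration. The helper names `embedding_degree` and `passes_basic_checks` are hypothetical, and the sketch assumes that the subgroup order $l$ is a prime that has already been extracted from $\#J(C)$, so that Item 2 is tested for $l$ rather than for the full group order; the threshold `bound` is a parameter the user has to choose.

```python
def embedding_degree(q, l, max_k):
    """Return the smallest k in [1, max_k] with l | q^k - 1, or None."""
    power = 1
    for k in range(1, max_k + 1):
        power = (power * q) % l
        if power == 1:
            return k
    return None

def passes_basic_checks(q, p, l, bound):
    """Item 3: l must be coprime to the characteristic p.
    Item 2: no k <= bound may satisfy l | q^k - 1."""
    if l % p == 0:
        return False
    return embedding_degree(q, l, bound) is None

# Toy numbers only: a subgroup of order 7 over F_2 has k = 3.
print(embedding_degree(2, 7, 10))
print(passes_basic_checks(2, 2, 7, bound=2))
print(passes_basic_checks(2, 2, 7, bound=3))
```

For real parameters the loop runs only up to the chosen bound, so the cost is a few modular multiplications per curve.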

The fourth item is less precise but has its importance: nowadays some people
advise against elliptic curves for which the class number of the ring
of endomorphisms is too small; the base field should be a prime field or a prime
degree extension field due to the threat of an attack by Weil descent [16]; and more
generally any special behavior of the curves could be considered as suspect.

Keeping all this in mind, consider now a curve $C$ of the form $y^3 = f(x)$ with $f$
of degree 4 over a field $\mathbb{F}_q$. Assume that $q$ is congruent to 2 modulo 3. Then every
element of $\mathbb{F}_q$ is a cube and has exactly one cube root, so each value of $x$ gives
exactly one affine point. Therefore $\#C/\mathbb{F}_q$ is equal to $q + 1$, counting the point
at infinity. Furthermore this is the case in every extension of $\mathbb{F}_q$ which does not
contain the third roots of unity, namely every odd degree extension. A simple
calculation with zeta functions shows that this implies that the coefficients $a_1$
and $a_3$ in the characteristic polynomial are zero, therefore $\chi(T)$ is of the form
$T^6 + a_2 T^4 + q a_2 T^2 + q^3$, as we observed in Example 1. It means that this curve
is highly "non-random" among all the curves of genus 3. In particular, the
3-torsion part of the Jacobian is partly degenerate which is a first step towards
supersingularity. In [35], the reader will find a survey about the gradation from
ordinary curves to supersingular curves and the link with the Newton polygon
of the characteristic polynomial.
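
The point count $q + 1$ is easy to confirm by brute force over small prime fields. The following Python sketch is only an illustration under stated assumptions: the function name `count_points_y3` and the sample polynomial are hypothetical, the field is a prime field $\mathbb{F}_p$ rather than an extension field, and a single point at infinity is assumed, as in the text.

```python
def count_points_y3(f_coeffs, p):
    """Count solutions of y^3 = f(x) over F_p, plus one point at infinity.

    f_coeffs lists the coefficients of f in increasing degree.
    """
    # Tally how many y in F_p map to each cube value.
    cube_roots = {}
    for y in range(p):
        c = pow(y, 3, p)
        cube_roots[c] = cube_roots.get(c, 0) + 1
    affine = 0
    for x in range(p):
        fx = sum(c * pow(x, i, p) for i, c in enumerate(f_coeffs)) % p
        affine += cube_roots.get(fx, 0)
    return affine + 1

f = [2, 1, 0, 3, 1]          # f(x) = x^4 + 3x^3 + x + 2 (arbitrary sample)
for p in (5, 11, 17):        # primes congruent to 2 modulo 3
    print(p, count_points_y3(f, p))
```

For each of these primes the count is $p + 1$ whatever $f$ is chosen, because cubing is a bijection of $\mathbb{F}_p$ when 3 does not divide $p - 1$.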

Having noticed this, one is tempted to claim that it is safer to take a base field
which includes the third roots of unity. However, we are confronted with another
problem, at least in characteristic 2. Indeed, $\mathbb{F}_{2^n}$ will contain the third roots of
unity if and only if $n$ is even. We could then be subject to a Weil descent attack:
if $n = 2m$, by doing a Weil restriction on $J(C)$, we get an abelian variety of
dimension 6 over $\mathbb{F}_{2^m}$. If someone is able to draw a curve of genus 6 on this
abelian variety, then the system is broken. As far as we know, nobody is able to
find such a curve (if it exists!) but this could be threatening enough to discourage
the use of $C$ for cryptography.

This phenomenon is only true when one wants to use a base field of small
characteristic. If we use a curve over a prime field, no Weil descent attack is to 
be feared and one can take a base field with roots of unity. This implies that 
there are additional automorphisms in the Jacobian and that the key-size should 
be slightly enlarged accordingly [8]. 

6 Conclusion 

We have presented an extension of Kedlaya's algorithm in order to count points
on superelliptic curves over finite fields of small characteristic. The time
complexity is the same as the complexity for hyperelliptic curves. This
complexity is asymptotically the same as the best known methods for counting points
on elliptic curves. Note however that the $\varepsilon$ which is involved in the
expression $O(\log^{3+\varepsilon} q)$ does not hide the same logarithmic factors. We obtained some
numerical examples proving that it is now feasible to count points of random
hyperelliptic and superelliptic curves up to genus 3, for cryptographic sizes.

Further research topics are: extend Kedlaya's algorithm for hyperelliptic
curves to characteristic 2, reduce the space complexity to $O(\log^2 q)$, extend the
algorithm to $C_{ab}$ curves or even to more general varieties (in fact
Monsky-Washnitzer cohomology exists for more general varieties).

Abstract. Frey and Rück gave a method to transform the discrete
logarithm problem in the divisor class group of a curve over $\mathbb{F}_q$ into a discrete
logarithm problem in some finite field extension $\mathbb{F}_{q^k}$. The discrete
logarithm problem can therefore be solved using index calculus algorithms
as long as $k$ is small.

In the elliptic curve case it was shown by Menezes, Okamoto and
Vanstone that for supersingular curves one has $k \le 6$. In this paper curves of
higher genus are studied. Bounds on the possible values for $k$ in the case
of supersingular curves are given which imply that supersingular curves
are weaker than the general case for cryptography. Ways to ensure that
a curve is not supersingular are also discussed.

A constructive application of supersingular curves to cryptography is 
given, by generalising an identity-based cryptosystem due to Boneh 
and Franklin. The generalised scheme provides a significant reduction 
in bandwidth compared with the original scheme. 



1 Introduction 

Frey and Rück [8] described how the Tate pairing can be used to map the discrete
logarithm problem in the divisor class group of a curve $C$ over a finite field $\mathbb{F}_q$
into the multiplicative group $\mathbb{F}_{q^k}^*$ of some extension of the base field. This has
significant implications for cryptography as there are well-known subexponential
algorithms for solving the discrete logarithm problem in a finite field. Therefore,
there is a method for solving the discrete logarithm problem in the divisor class
group in those cases where the extension degree $k$ is small.

The extension degree required is the smallest integer $k$ such that the large
prime order $l$ of the divisor class group $\mathrm{Pic}^0_C(\mathbb{F}_q)$ is such that $l \mid (q^k - 1)$. In general
the value of $k$ depends on both the field and the curve and is very large (i.e.,
$\log(k) \approx \log(q)$).

Menezes, Okamoto and Vanstone [23] showed that for supersingular elliptic
curves the value $k$ above is always less than or equal to 6. This important result
implies that supersingular elliptic curves are weaker than the general case for
cryptography.

Elliptic curve cryptography was generalised to higher genus curves by Koblitz 
[16]. Our main result is Theorem 3 which states that for supersingular curves 
there is an upper bound, which depends only on the genus, on the values of the 
extension degree k. This bound is sufficiently small (see Table 1) that supersin- 
gular curves should be considered weaker than the general case for cryptography. 

It is important to be able to detect these weak cases in advance, especially
when one is considering curves defined over small fields and using the zeta
function to compute the group order over extension fields. Sakai, Sakurai and Ishizuka
[27] were unable to find any secure hyperelliptic curves of genus two over $\mathbb{F}_2$. In
Section 5 we show why the authors of [27] failed in their search and we explain
how to avoid equations for supersingular curves in characteristic two. As an
illustration we overcome the problem encountered in [27] and provide examples
of secure genus two curves over $\mathbb{F}_2$.

Recently, beginning with the work of Joux [14], the Weil pairing has found 
positive applications in cryptography. In Section 3 we generalise an identity- 
based cryptosystem due to Boneh and Franklin [2] . Our scheme provides a sig- 
nificant improvement in bandwidth over the scheme of Boneh and Franklin. 



2 The Tate Pairing 

In this section we summarise various known results. Throughout the paper $C$
is a non-singular, irreducible curve of genus $g$ over a finite field $\mathbb{F}_q$ where $q$ is a
power of a prime $p$. The Jacobian of the curve $C$ is an abelian variety $\mathrm{Jac}(C)$ of
dimension $g$ defined over $\mathbb{F}_q$. The $\mathbb{F}_q$-rational points on the Jacobian correspond
to the divisor class group of the curve over $\mathbb{F}_q$, which we denote $\mathrm{Pic}^0_C(\mathbb{F}_q)$ (for
background details see [4], [16], [29], [33]).

Those readers only interested in elliptic curves can take $C$ to be an elliptic
curve and can think of $\mathrm{Jac}(C)(\mathbb{F}_q) = \mathrm{Pic}^0_C(\mathbb{F}_q) = C(\mathbb{F}_q)$.



2.1 The Tate Pairing 

Let $l$ be a positive integer which is coprime to $q$. In most applications $l$ is a prime
and $l \mid \#\mathrm{Pic}^0_C(\mathbb{F}_q)$. Let $k$ be a positive integer such that the field $\mathbb{F}_{q^k}$ contains the
$l$th roots of unity (in other words, $l \mid (q^k - 1)$).

Let $G = \mathrm{Pic}^0_C(\mathbb{F}_{q^k})$ and write $G[l]$ for the subgroup of divisors of order $l$ and
$G/lG$ for the quotient group. The Tate pairing is a mapping

$$\langle \cdot, \cdot \rangle : G[l] \times G/lG \to \mathbb{F}_{q^k}^* / (\mathbb{F}_{q^k}^*)^l \qquad (1)$$

where the right hand side is the quotient group of elements of $\mathbb{F}_{q^k}^*$ modulo $l$th
powers. Note that all three groups $G[l]$, $G/lG$ and $\mathbb{F}_{q^k}^*/(\mathbb{F}_{q^k}^*)^l$ have exponent $l$.
The Tate pairing satisfies the following properties [8]:

1. (Well-defined) $\langle 0, Q\rangle \in (\mathbb{F}_{q^k}^*)^l$ for all $Q \in G$ and $\langle P, Q\rangle \in (\mathbb{F}_{q^k}^*)^l$ for all
$P \in G[l]$ and all $Q \in lG$.

2. (Non-degeneracy) For each divisor class $P \in G[l] - \{0\}$ there is some divisor
class $Q \in G$ such that $\langle P, Q\rangle \notin (\mathbb{F}_{q^k}^*)^l$.

3. (Bilinearity) For any integer $n$, $\langle nP, Q\rangle = \langle P, nQ\rangle = \langle P, Q\rangle^n$ modulo $l$th
powers.

The Tate pairing is computed as follows: Let $P$ be a divisor of order $l$.
There is a function $f$ whose divisor, which we write as $(f)$, is equal to $lP$.
Then $\langle P, Q\rangle = f(Q')$ where $Q'$ is a divisor in the same class as $Q$ such that
the support of $Q'$ is disjoint from the support of $(f)$. This computation is easily
implemented in practice by using the double and add algorithm and evaluating
all the intermediate functions at $Q'$ (see [8], [9]).

The value $f(Q')$ lies in $\mathbb{F}_{q^k}^*$. By raising it to the power $(q^k - 1)/l$ we obtain
an $l$th root of unity.

One subtlety when implementing the Tate pairing is finding a divisor $Q'$
with support disjoint from the partial terms in the addition chain for $lP$. In the
elliptic curve case this is done by taking $Q' = (Q + S) - (S)$ where $(Q) - (\infty)$
is the target divisor and where $S$ is an arbitrary point (not necessarily of order
$l$). In the higher genus case general Riemann-Roch algorithms can be used to
give an analogous solution. In practice, it is often easier not to choose the class
$Q$ first but to just choose two 'random' effective divisors $D_1$ and $D_2$ of degree
$g$ and set $Q' = D_1 - D_2$. If $D_1$ and $D_2$ are chosen randomly over $\mathbb{F}_{q^k}$ then with
high probability we expect $\langle P, Q'\rangle \notin (\mathbb{F}_{q^k}^*)^l$.

In the case of elliptic curves one can compare the Tate pairing with the Weil
pairing. In general there is no relationship between the Tate pairing and the Weil
pairing, as they are defined on different sets. However, when $E$ is an elliptic curve
such that $l^2 \,\|\, \#E(\mathbb{F}_{q^k})$ and $P, Q$ are independent points in $E(\mathbb{F}_{q^k})[l]$ then we have
$e_l(P, Q) = \langle P, Q\rangle / \langle Q, P\rangle$. A consequence of this is that the Tate pairing is not
symmetric.

The Weil pairing requires working over the field $\mathbb{F}_q(E[l])$ generated by the
coordinates of all the $l$-division points. In general, one would expect this field
to be larger than that used for the Tate pairing, however at ECC '97 Koblitz
observed that these fields are usually the same. Finally, the Weil pairing requires
roughly twice the computation time of the Tate pairing, although this is partly
offset by the added cost of a finite field exponentiation (to the power $(q^k - 1)/l$)
in the case of the Tate pairing if a unique value is required.

2.2 The Frey-Rück Attack

We now recall how the Tate pairing is used to attack the discrete logarithm
problem in the divisor class group of a curve (this approach is often called the
Frey-Rück attack, after [8]). Let $D_1, D_2 \in \mathrm{Pic}^0_C(\mathbb{F}_q)$ be divisors of order $l$ for
which we want to solve the discrete logarithm problem $D_2 = \lambda D_1$. Let $k$ be the
smallest integer such that the pairing is non-degenerate (hence $l \mid (q^k - 1)$). The
method proceeds as follows:

1. Choose random divisors $Q \in \mathrm{Pic}^0_C(\mathbb{F}_{q^k})$ until $\langle D_1, Q\rangle \notin (\mathbb{F}_{q^k}^*)^l$.

2. Compute $\zeta_i = \langle D_i, Q\rangle \in \mathbb{F}_{q^k}^*$ for $i = 1, 2$.

3. Raise $\zeta_i$ to the power $(q^k - 1)/l$ (this stage is optional since the linear algebra
in the index calculus method below should be performed modulo $l$).

4. Solve the discrete logarithm problem $\zeta_2 = \zeta_1^{\lambda}$ in the finite field $\mathbb{F}_{q^k}^*$ using an
index calculus method.

This strategy is practical when k is small. This leads to the following impor- 
tant question for cryptography: 

Question: Are there certain weak cases of curves for which k is always small? 

One of the goals of this paper is to show that, as in the case of elliptic
curves, supersingular curves always have small $k$. Of course, there are lots of
non-supersingular curves for which the Frey-Rück attack applies (e.g., elliptic
curves over $\mathbb{F}_p$ with $p - 1$ points).



2.3 Non-degeneracy of the Tate Pairing

We now discuss the non-degeneracy property a little more closely. Let $P \in G[l]$.
We consider the possibilities for $\langle P, P\rangle$. To compute $\langle P, P\rangle$ it is necessary to
compute a divisor $Q$ in the same class as $P$ but which has support disjoint from
all the intermediate terms in the computation of $lP$. One can then compute
$\langle P, Q\rangle$ to obtain the value of the pairing. If $P \in lG$ then $\langle P, P\rangle \in (\mathbb{F}_{q^k}^*)^l$. If
$P \in \mathrm{Pic}^0_C(\mathbb{F}_q)$ then $\langle P, P\rangle \in \mathbb{F}_q^*$, but if $l$ is prime and if $l$ does not divide $(q - 1)$
then $\langle P, P\rangle \in (\mathbb{F}_{q^k}^*)^l$ since every element of $\mathbb{F}_q^*$ is an $l$th power in that case.
Hence to have $\langle P, P\rangle$ nontrivial it is necessary (but not sufficient) that $l \mid (q - 1)$
and so $k = 1$.

The following result originates from the work of [2] and [36]. It provides a
very useful technique for finding points where the pairing is non-degenerate.

Lemma 1. Let $E$ be an elliptic curve. Let $P \in E(\mathbb{F}_q)$ be a point of prime
order $l$. Let $\mathbb{F}_{q^k}$ be the extension over which all points of order $l$ are defined,
and write $G = E(\mathbb{F}_{q^k})$. Suppose that $l^2 \,\|\, \#G$ (i.e., that $G[l] = G/lG$). Let $\psi$
be an endomorphism of $E$ which is not defined over $\mathbb{F}_q$. If $\psi(P) \notin E(\mathbb{F}_q)$ then
$\langle P, \psi(P)\rangle^{(q^k-1)/l} \ne 1$.

For the proof see the full version [11]. We refer to the maps $\psi$ as
'non-$\mathbb{F}_q$-rational endomorphisms' (Verheul [36] calls them 'distortion maps').

In the case of curves of genus greater than one this result is no longer
true. On the other hand, in this setting there are usually many endomorphisms
$\psi$ available. Indeed, for supersingular abelian varieties it will generally be true
that, for all $P$, there is some endomorphism $\psi$ such that $\langle P, \psi(P)\rangle^{(q^k-1)/l} \ne 1$.

3 Identity-Based Cryptosystems Using the Tate Pairing

Identity based cryptography was proposed by Shamir [28] as a response to the 
problem of managing public keys. The basic principle is that it should be possible 
to derive a user’s public data only from their identity. It is therefore necessary to 
have a trusted dealer who can provide a user with the secret key corresponding 
to the public key which is derived from their identity. It has turned out to be 
rather difficult to construct efficient and secure identity-based cryptosystems. 

Recently, Boneh and Franklin [2] developed a new identity-based cryptosys- 
tem using the Weil pairing on a specific supersingular elliptic curve. In this 
section we show that the use of other supersingular curves leads to significant 
efficiency improvements over the original scheme. 

3.1 Dealer’s System Parameters 

The dealer sets up the scheme by choosing a finite field $\mathbb{F}_q$ and a curve $C$ over
$\mathbb{F}_q$ of genus $g$ such that:

1. There is a large prime $l$ dividing the order of the group $\mathrm{Pic}^0_C(\mathbb{F}_q)$.

2. The degree $k$ needed for the Tate pairing embedding of the subgroup of order
$l$ (i.e., the smallest $k$ such that $l \mid (q^k - 1)$) is relatively small.

One approach is to take $C$ to be a supersingular curve.

The dealer then chooses a divisor $P \in \mathrm{Pic}^0_C(\mathbb{F}_q)$ of order $l$ and a secret
integer $1 < s < l$ and computes $P' = sP$. The dealer publishes $q, C, l, k, P, P'$
and keeps the integer $s$ secret. The public data for the scheme also includes two
hash functions $H_1$ and $H_2$ (these are called $G$ and $H$ in [2]). The function $H_1$
is used to map identities to bitstrings which are then used to represent divisors
in $\mathrm{Pic}^0_C(\mathbb{F}_{q^k})$. The function $H_2$ maps elements of the subgroup of order $l$ of
$\mathbb{F}_{q^k}^*$ to bitstrings of a certain length $N$. Both hash functions are required to be
cryptographically strong and are modelled in the security proofs of [2] as random
oracles.

3.2 User’s Public and Private Key 

We now discuss how a user's identity gives rise to a public key. There must be a
procedure to convert the identity of user $A$ (such as their name or email address),
to a divisor $Q_A \in G = \mathrm{Pic}^0_C(\mathbb{F}_{q^k})$ such that:

1. $\langle P, Q_A\rangle \notin (\mathbb{F}_{q^k}^*)^l$.

2. The process should be one-way, in the sense that it be infeasible to find an
identity which gives rise to a given point $Q_A$.

3. The points $Q_A$ should be distributed uniformly in an appropriate set.

In [2] this process (which Boneh and Franklin call 'MapToPoint') is solved
using a cryptographically strong hash function $H_1$ and a non-$\mathbb{F}_q$-rational
endomorphism $\psi$. We now sketch a generalisation of their method.

The identity bitstring is concatenated with a padding string and then passed
through the hash function $H_1$ (which is constructed to yield a full domain
output). This process is repeated using a deterministic sequence of padding strings
until the output is the $x$-coordinate (or $a(x)$-term in the higher genus case) of
an element $Q$ of $\mathrm{Pic}^0_C(\mathbb{F}_q)$. It is then easy to find the rest of the representation
of $Q$. One then sets $Q_A = \psi(mQ) \in G$ for a suitable non-$\mathbb{F}_q$-rational
endomorphism $\psi$ from the available possibilities, where $m$ is the cofactor $m = \#\mathrm{Pic}^0_C(\mathbb{F}_q)/l$. This
process is repeated until $\langle P, Q_A\rangle^{(q^k-1)/l} \ne 1$.

A more general scheme, which does not require non-$\mathbb{F}_q$-rational
endomorphisms, is given in [11].

To summarise, every user $A$ has a public key consisting of the divisor $Q_A$ and
everyone can obtain this public key just knowing the identity of the user. Each
user asks the dealer for a private key $Q'_A = sQ_A$. This must be transmitted to
the user using a secure channel.



3.3 Encryption and Decryption 

Let the message $M$ be a bitstring of length $N$ and suppose we want to send this
to user $A$. First derive the public key $Q_A$ from the identity of $A$ and obtain the
dealer's public keys $P$ and $P'$. The remaining steps are

1. Choose a random integer $1 < r < l$.

2. Compute $R = rP$.

3. Compute $S = M \oplus H_2(\langle P', Q_A\rangle^{r(q^k-1)/l})$. (Recall that $\langle P', Q_A\rangle \in \mathbb{F}_{q^k}^*$.)

4. Send $(R, S)$.

To decrypt, user $A$ simply uses their private key $Q'_A$ to compute $\langle R, Q'_A\rangle^{(q^k-1)/l}$.
Recall that $\langle rP, sQ_A\rangle = \langle P, Q_A\rangle^{rs} = \langle P', Q_A\rangle^{r}$ modulo $l$th powers. Hence the
message is recovered from

$$M = S \oplus H_2(\langle R, Q'_A\rangle^{(q^k-1)/l}).$$

A more versatile encryption process is obtained by using
$H_2(\langle P', Q_A\rangle^{r(q^k-1)/l})$ as the key for a fixed symmetric encryption function.
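
The way bilinearity makes decryption work can be checked with a toy model. The Python sketch below is an assumption-laden illustration and offers no security: divisor classes of order $l$ are replaced by integers modulo a small prime `L`, the pairing is replaced by the map $(a, b) \mapsto g^{ab}$ into a subgroup of order `L` of a small prime field, and $H_2$ is a truncated SHA-256. All names and parameters are hypothetical, and the final exponentiation is omitted because the toy pairing already returns unique values.

```python
import hashlib
import secrets

L = 1019            # toy prime group order (plays the role of l)
P_FIELD = 2039      # prime with L | P_FIELD - 1
G_T = 4             # element of order L in the multiplicative group mod P_FIELD

def pairing(a, b):
    """Toy bilinear map on Z/L x Z/L; stands in for the reduced Tate pairing."""
    return pow(G_T, (a * b) % L, P_FIELD)

def h2(z, n):
    """Toy H_2: hash a field element to n bytes (n <= 32)."""
    return hashlib.sha256(str(z).encode()).digest()[:n]

def xor(a, b):
    return bytes(u ^ v for u, v in zip(a, b))

P = 1                                   # generator of the toy group
s = 2 + secrets.randbelow(L - 2)        # dealer's secret, 1 < s < L
P_pub = (s * P) % L                     # dealer's public key P' = sP
Q_A = 7                                 # user's public key (would come from the identity)
d_A = (s * Q_A) % L                     # user's private key sQ_A

def encrypt(msg):
    r = 2 + secrets.randbelow(L - 2)    # 1 < r < L
    R = (r * P) % L
    key = pow(pairing(P_pub, Q_A), r, P_FIELD)   # <P', Q_A>^r
    return R, xor(msg, h2(key, len(msg)))

def decrypt(R, S):
    key = pairing(R, d_A)                        # <rP, sQ_A>
    return xor(S, h2(key, len(S)))

R, S = encrypt(b"attack at dawn")
print(decrypt(R, S))
```

Both sides arrive at the same field element because $\langle rP, sQ_A\rangle$ and $\langle sP, Q_A\rangle^r$ are equal, which is exactly the identity recalled above.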

3.4 Security 

The security of this system relies on the following variant of the Diffie-Hellman 
problem: 

Definition 1. The Tate-Diffie-Hellman problem (TDH) is the following:
Let $G$ and $l$ be as above. Given divisors $P$, $P' = sP$, $R = rP$ and $Q_A \in G$ of
order $l$ such that $\langle P, Q_A\rangle^{(q^k-1)/l} \ne 1$, compute $\zeta = \langle P, Q_A\rangle^{rs(q^k-1)/l}$.

Let $P \in \mathrm{Pic}^0_C(\mathbb{F}_q)$ be any divisor of large prime order $l$. We make the
assumption that the Tate-Diffie-Hellman problem is hard over random $P'$, $R$, $Q_A$,
i.e., where $Q_A = \psi(Q)$ (for a suitable non-$\mathbb{F}_q$-rational endomorphism) and where
$P', R, Q \in \langle P\rangle$ are chosen uniformly at random.

If one can solve the elliptic curve Diffie-Hellman problem then one can
compute $rsP$ and thus $\langle rsP, Q_A\rangle$. Similarly, if one can solve the Diffie-Hellman
problem in $\mathbb{F}_{q^k}^*$ then one can solve the TDH.

To produce a cryptosystem with strong security properties (indistinguisha- 
bility of encryptions under a chosen ciphertext attack) one uses a method of 
Fujisaki and Okamoto which is discussed thoroughly in [2]. First it is necessary 
to establish that the basic scheme has the ‘one-way encryption’ (ID-OWE) se- 
curity property (see Section 2 of [2]). The security proof for the scheme above is 
completely analogous to the proof of Theorem 4.1 of [2] and it holds under the 
assumptions that the hash functions Hi and H 2 are random oracles and that 
the TDH problem is hard. 

3.5 Parameter Sizes and Performance 

For security it is necessary that > 2^®° and Boneh and Franklin 

[2] use g = 1 and k = 2 and so they must take q to be of size at least 512 bits^. 
The whole point of our generalisation is the observation that if k can be taken 
to be larger than 2 then q may be taken to be smaller. In Section 3.6 we give 
the details for a curve with k = 6. Hence there are the following advantages of 
the generalised scheme compared with the scheme of [2] . 

- The bandwidth (number of bits) of an encryption $(R, S)$ can be reduced (see
Section 3.6 below).

- For the same reason, the dealer's public keys also require less storage and
communication bandwidth with the new scheme.

- The dominant cost in encryption and decryption is the evaluation of the Tate
pairing. Since this involves computations in the large field $\mathbb{F}_{q^k}$ the cost of
encryption and decryption is roughly comparable for both schemes, although
there are some savings available in characteristic two.

As mentioned in [2], the computation of the Weil and Tate pairings can be 
made much faster by choosing the prime I of size around 160 bits. 

3.6 Characteristic Three Example 

With elliptic curves one can realise an improvement of $k$ from 2 to 6 by taking
the elliptic curves

$$E_1 : y^2 = x^3 - x + 1 \quad\text{and}\quad E_2 : y^2 = x^3 - x - 1$$

over $\mathbb{F}_3$, which have characteristic polynomial of Frobenius $P_{E_1}(X) = X^2 + 3X + 3$
and $P_{E_2}(X) = X^2 - 3X + 3$ respectively. These curves are thoroughly
discussed by Koblitz in [18].

1 Actually, in [2] it is specified that $q$ have 1024 bits, but 512 bits seems to be sufficient.


S.D. Galbraith 



A convenient non-$\mathbb{F}_3$-rational endomorphism for these curves is

$$\psi : (x, y) \mapsto (-\alpha - x, iy)$$

where $i \in \mathbb{F}_{3^2}$ satisfies $i^2 = -1$ and $\alpha \in \mathbb{F}_{3^3}$ satisfies $\alpha^3 - \alpha + 1 = 0$.

We list some values of $m$ such that the group order of $E_i(\mathbb{F}_{3^m})$ is equal to a
small cofactor $c$ times a large prime $l$.



  m     i    # bits in l    c
  79    2    125            1
  97    1    151            7
  149   1    220            7 · 15199
  163   1    256            7
  163   2    259            1
  167   1    262            7
  167   2    237            8017 · 44089
  173   2    241            16420688749
  193   2    306            1
  239   2    379            1

Consider, say, the case $m = 163$, for which $\mathbb{F}_{3^{163}}$ is a 259 bit field. Since $k = 6$ the
size of the field $\mathbb{F}_{3^{6m}}$ is 1551 bits. If messages are of length $N = 160$ bits then
an encryption requires 160 + 260 = 420 bits (259 bits for the $x$-coordinate of
the point and one bit to specify the $y$-coordinate). For equivalent security using
the Boneh-Franklin scheme with $k = 2$ one must take $p$ to be $\lceil 1551/2 \rceil = 776$
bits and so an encryption will require 160 + 776 = 936 bits (we have 776 as
the Boneh-Franklin scheme only requires sending the $y$-coordinate). Hence our
scheme requires less than half the bandwidth of the Boneh-Franklin scheme for
the same security level.
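
The arithmetic behind these figures can be reproduced directly. The short Python sketch below is only a check of the numbers quoted above; the variable names are hypothetical and field sizes are measured as the bit length of the field cardinality.

```python
m, k, hash_bits = 163, 6, 160

field_bits = (3**m).bit_length()         # size of F_{3^m}
ext_bits = (3**(k * m)).bit_length()     # size of F_{3^{km}}

ours = hash_bits + field_bits + 1        # x-coordinate plus one bit for y
bf_p_bits = -(-ext_bits // 2)            # ceil(ext_bits / 2) for k = 2
boneh_franklin = hash_bits + bf_p_bits   # only one coordinate is sent

print(field_bits, ext_bits, ours, bf_p_bits, boneh_franklin)
```

With these parameters the script prints 259, 1551, 420, 776 and 936, matching the sizes used in the comparison.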



3.7 Characteristic Two Example 

In characteristic two there are curves available which attain the Frey-Rück
embedding degree $k = 4$. In these cases the bandwidth improvement is not as
significant as that seen with the characteristic three example above. However, it
is easy to get an improvement in performance over the scheme in [2].

Consider the elliptic curves

$$E_1 : y^2 + y = x^3 + x \quad\text{and}\quad E_2 : y^2 + y = x^3 + x + 1$$

over $\mathbb{F}_2$. Then $E_1$ has characteristic polynomial of Frobenius $P_{E_1}(X) = X^2 + 2X + 2$
while $E_2$ is the quadratic twist of $E_1$ and has $P_{E_2}(X) = X^2 - 2X + 2$.

We list some values of $m$ such that $\#E_i(\mathbb{F}_{2^m}) = cl$ where $l$ is a large prime
and where $c$ is a cofactor.

  m     i    # bits in l    c
  233   1    210            5 · 3108221
  239   2    239            1
  241   2    241            1
  271   1    252            5 · 97561
  283   1    281            5
  283   2    283            1
  353   2    353            1
  367   2    367            1
  397   2    397            1
  457   2    457            1

A convenient non-$\mathbb{F}_2$-rational endomorphism for both these curves is given by

$$\psi : (x, y) \mapsto (u^2 x + s^2,\; y + u^2 s x + s)$$

where $u \in \mathbb{F}_{2^2}$ satisfies $u^2 + u + 1 = 0$ and $s \in \mathbb{F}_{2^4}$ satisfies $s^2 + (u + 1)s + 1 = 0$.

We give a comparison between characteristic 2 and large characteristic p 
for equivalent sized finite fields. We give the average time (in seconds) for the 
computation of the Tate pairing and the finite field exponentiation using the 
Magma computer algebra package. We also give a comparison of the commu- 
nication bandwidth (number of bits) for the basic scheme (assuming a 160 bit 
hash function H). 

The first case is with 965 bit finite field security (i.e., using $E_2$ over $\mathbb{F}_{2^{241}}$,
which has a prime number of points).



  Characteristic    Time    Bandwidth
  2                 2.4     402
  p                 4.3     642


Now for 1132 bit finite field security, this time using $E_1(\mathbb{F}_{2^{283}})$, whose number
of points is 5 times a prime.



  Characteristic    Time    Bandwidth
  2                 3.4     444
  p                 6.1     726


Clearly, the elliptic curves used by Boneh and Franklin lead to a scheme 
which requires about twice the computation time and over one and a half times 
the bandwidth compared with using curves in characteristic two. 



3.8 Open Questions 

We have seen that larger values of $k$ help to make a more efficient identity-based
cryptosystem. The problem is therefore to find curves $C$ which have suitable
large values of $k$ (without being too large). This is very closely related to the
question of Section 2.2.

For supersingular curves we will show in Section 4.3 that there is an upper
bound $k(g)$ (depending only on the genus $g$) for the values of $k$. The values of $k(g)$
are large enough to give good performance for the identity-based cryptosystem.
However, it seems that one cannot realise these large values for $k(g)$ with suitable
Jacobians of curves. It seems that the supersingular elliptic curves with $k = 4$
and $k = 6$ are the optimal choice for the identity-based cryptosystem and other
applications using supersingular curves. More research is needed to clarify this.

It is not necessary to insist on using supersingular curves for the identity-based
cryptosystem, since there should exist non-supersingular elliptic curves $E$
over certain finite fields with relatively small values of $k$. However, for such $E$
it is usually the case that the order of $E(\mathbb{F}_q)$ is not divisible by a large prime (one
exception is the case $p = 2l + 1$, but these only have $k = 1$). This phenomenon is
indicated by the results of Balasubramanian and Koblitz [1] and is confirmed by
computer experiments. It would be extremely interesting to have a construction
for non-supersingular curves with relatively small values of $k$.

4 Supersingular Curves over Finite Fields 

In this section we recall some facts about supersingular curves and we give our 
main result (Theorem 3). More details can be found in the full version of this 
paper [11]. 

As before, C is a non-singular, irreducible curve of genus g over a finite
field $\mathbb{F}_q$. The Frobenius endomorphism $\pi$ on Jac(C) satisfies a characteristic
polynomial P(X) of degree 2g with integer coefficients. We can factor P(X)
over the complex numbers as $P(X) = \prod_{i=1}^{2g} (X - \alpha_i)$. It turns out that the
algebraic integers $\alpha_i$ have certain remarkable properties. In particular:

1. The numbers $\alpha_i$ satisfy $|\alpha_i| = \sqrt{q}$ and they can be indexed such that
   $\alpha_i \alpha_{g+i} = q$.

2. P(X) has the following form

   $X^{2g} + a_1 X^{2g-1} + a_2 X^{2g-2} + \cdots + a_g X^g + q a_{g-1} X^{g-1} + \cdots + q^{g-1} a_1 X + q^g$.

3. For any integer $r \ge 1$ we have $\#C(\mathbb{F}_{q^r}) = q^r + 1 - \sum_{i=1}^{2g} \alpha_i^r$.

4. For any integer $r \ge 1$ we have $\#\mathrm{Jac}(C)(\mathbb{F}_{q^r}) = \prod_{i=1}^{2g} (1 - \alpha_i^r)$.

The formula of property 4 for $\#\mathrm{Jac}(C)(\mathbb{F}_{q^r})$ gives an efficient method for
computing the number of points in the divisor class group of a curve over a
large-degree extension of the field $\mathbb{F}_q$ once one has computed P(X) (see Appendix 1
for details about computing P(X)). For cryptography one wants a curve such
that $\#\mathrm{Jac}(C)(\mathbb{F}_{q^r})$ is divisible by a large prime $l$ and such that the group resists
the known attacks ([8], [26]) on the discrete logarithm problem.
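As an illustration of property 4, the following sketch evaluates the product of the factors $(1 - \alpha_i^r)$ using only integer arithmetic. It is a minimal sketch and not the method used for the computations in this paper: the function names and the coefficient-list convention (highest degree first, leading coefficient 1) are assumptions made for the example. Newton's identities give the power sums of the $\alpha_i$, and a second application of the identities recovers the elementary symmetric functions of the $\alpha_i^r$.

```python
def power_sums(coeffs, n_max):
    """Power sums s_n = sum(alpha_i^n) of the roots of the monic polynomial
    X^d + coeffs[1] X^(d-1) + ... + coeffs[d], via Newton's identities."""
    d = len(coeffs) - 1
    s = [d]  # s_0 = number of roots
    for n in range(1, n_max + 1):
        total = n * coeffs[n] if n <= d else 0
        for i in range(1, min(n - 1, d) + 1):
            total += coeffs[i] * s[n - i]
        s.append(-total)
    return s


def jacobian_order(coeffs, r):
    """Return prod(1 - alpha_i^r) over the roots alpha_i of P(X)."""
    d = len(coeffs) - 1
    s = power_sums(coeffs, d * r)
    # Power sums of the numbers beta_i = alpha_i^r.
    p = [0] + [s[r * j] for j in range(1, d + 1)]
    # Elementary symmetric functions e_k of the beta_i (Newton's identities).
    e = [1]
    for k in range(1, d + 1):
        acc = sum((-1) ** (i - 1) * e[k - i] * p[i] for i in range(1, k + 1))
        assert acc % k == 0  # the division is always exact
        e.append(acc // k)
    # prod(1 - beta_i) = sum_k (-1)^k e_k
    return sum((-1) ** k * e[k] for k in range(d + 1))
```

For example, `jacobian_order([1, 0, 2], 2)` treats $P(X) = X^2 + 2$ over the field of two elements and returns the group order over the quadratic extension.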

A common strategy is to try values of r until one is found for which the large
prime $l$ satisfies $\gcd(l, q) = 1$ and $q^{rk} \not\equiv 1 \pmod{l}$ for 'small' k. If the original
curve is supersingular then, as we will show, it is futile to try many different
values for r since the Frey-Rück attack will always work. Hence, it is important
to know that such curves should be discarded right from the start.

4.1 Supersingularity

Recall that an elliptic curve E over $\mathbb{F}_{p^m}$ is supersingular if $E(\overline{\mathbb{F}}_p)$ has no points
of order p (see [29]).

Definition 2. (Oort [24]) An abelian variety A over $\mathbb{F}_q$ is called supersingular
if A is isogenous to a product of supersingular elliptic curves. A curve C over
$\mathbb{F}_q$ is called supersingular if Jac(C) is supersingular.

The following result follows from the work of Manin and Oort.

Theorem 1. The following conditions on an abelian variety A over $\mathbb{F}_q$ of
dimension g are equivalent.

1. A is supersingular.

2. A is isogenous (over some finite extension of $\mathbb{F}_q$) to $E^g$ for some
   supersingular elliptic curve E.

3. There is some integer k such that the characteristic polynomial of Frobenius
   on A over $\mathbb{F}_{q^k}$ is $P(X) = (X \pm q^{k/2})^{2g}$.

4. There is some integer k such that $\pi^k = \pm q^{k/2}$.

5. For some positive integer k we have $\#A(\mathbb{F}_{q^k}) = (q^{k/2} \pm 1)^{2g}$.

The fourth property is the one which is most important for our application. 

4.2 A Criterion for Supersingularity 

The following result follows from Proposition 1 of Stichtenoth and Xing [34]. It
gives a simple test for whether or not an abelian variety is supersingular, once
P(X) has been computed.

Theorem 2. Suppose $q = p^a$ and suppose that A is an abelian variety of
dimension g over $\mathbb{F}_q$. Let $P(X) = X^{2g} + a_1 X^{2g-1} + \cdots + a_g X^g + \cdots + q^g$ be
the characteristic polynomial of the Frobenius endomorphism on A. Then A is
supersingular if and only if, for all $1 \le j \le g$,

   $p^{\lceil aj/2 \rceil} \mid a_j$.
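The divisibility criterion is easy to check mechanically. The sketch below assumes the criterion reads "$p^{\lceil aj/2 \rceil}$ divides $a_j$ for $1 \le j \le g$" with $q = p^a$; the function name and the coefficient-list convention $[1, a_1, \ldots, a_{2g}]$ are choices made for this example only.

```python
def is_supersingular(coeffs, p, a):
    """Apply the divisibility test to P(X) = X^(2g) + a_1 X^(2g-1) + ... + q^g.

    coeffs = [1, a_1, ..., a_2g] and q = p**a.
    """
    g = (len(coeffs) - 1) // 2
    for j in range(1, g + 1):
        exponent = (a * j + 1) // 2  # ceil(a*j/2) for non-negative integers
        if coeffs[j] % p ** exponent != 0:
            return False
    return True
```

With p = 2 and a = 1, the polynomial $X^4 + X^2 + 4$ fails the test because its coefficient $a_2 = 1$ is odd, whereas $X^2 + 2$ passes.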

4.3 The Bound on the Extension Degree 

The values of k which arise depend on properties of cyclotomic polynomials (i.e.,
irreducible factors over $\mathbb{Z}$ of $X^m - 1$ for some m). Hence we make the following
definitions.

Definition 3. For each positive integer g let $\mathcal{P}_g = \{p(X) \in \mathbb{Z}[X] : \deg p(X) = 2g$,
$p(X)$ irreducible over $\mathbb{Z}$, $p(X) \mid (X^m - 1)$ for some $m\}$. For each $p(X) \in \mathcal{P}_g$
define $m(p(X)) = \min\{m : p(X) \mid (X^m - 1)\}$. Define k'(g) to be $\max\{m(p(X)) : p(X) \in \mathcal{P}_g\}$.
Define k(g) to be

   $\max\{\mathrm{lcm}(m(p_1(X)), \ldots, m(p_n(X))) : g = \sum_{i=1}^{n} g_i,\ p_i(X) \in \mathcal{P}_{g_i}\}$.

We now state our main result. We emphasise that the bound k(g) depends
only on the genus and not on the abelian variety A.

Theorem 3. Let A be a supersingular abelian variety of dimension g over a
field $\mathbb{F}_q$, then there exists an integer $k \le k(g)$ such that, for all integers $r \ge 1$,
the exponent of the group $A(\mathbb{F}_{q^r})$ divides $q^{kr} - 1$.

Proof. First, take a quadratic extension so that $q^r$ is a square, i.e., consider
$q_0 = q^{2r}$. Let P(X) be the characteristic polynomial of the Frobenius endomorphism
on A over $\mathbb{F}_{q_0}$ and write $\alpha_i$ for the roots (they are the squares of the values of
the roots corresponding to A over $\mathbb{F}_{q^r}$).

We follow the proof of Theorem 4.2 of Oort [24] and consider

   $P'(X) = P(\sqrt{q_0}\, X)/q_0^g = X^{2g} + (a_1/\sqrt{q_0}) X^{2g-1} + \cdots + 1$

which has roots $\alpha_i/\sqrt{q_0}$. By Theorem 2 the coefficients of P'(X) are integers.

The numbers $\alpha_i/\sqrt{q_0}$ are algebraic integers which are units but, by Theorem
4.1 of Manin [21], it follows that they are actually roots of unity. Therefore P'(X)
is a product of cyclotomic polynomials.

By definition of k(g) there is some $k \le k(g)$ such that $(\alpha_i/\sqrt{q_0})^k = 1$ for all i.
In other words, $\alpha_i^k = q_0^{k/2}$ for all i and so $\pi^k = q_0^{k/2}$. For all points $P \in A(\mathbb{F}_{q_0^k})$
we have $P = \pi^k(P) = [q_0^{k/2}]P$. It follows that the exponent of $A(\mathbb{F}_{q_0^k})$ divides
$q_0^{k/2} - 1$ (see also Stichtenoth and Xing [34] Proposition 2). Since $q_0^{k/2} - 1 = q^{kr} - 1$
the result is proven. □

We now consider the values of k(g). The polynomials $X^m - 1$ factor
into products of cyclotomic polynomials $\Phi_n(X)$ for each $n \mid m$ (see Lang [19] VI.3). The
polynomials $\Phi_n(X)$ have degree $\varphi(n)$ (this is the Euler $\varphi$-function) so the values
of k'(g) are related to the problem of finding the largest value of n for which
$\varphi(n) = 2g$. The extremal case is when n is the product of the first k primes and
so $\varphi(n) = \prod_{i=1}^{k} (p_i - 1)$ (e.g., $\varphi(6) = 2$, $\varphi(30) = 8$, $\varphi(210) = 48$ etc). The values
of k(g) relate to the ways of taking least common multiples of the m(p(X)).



Table 1. Values of k(g). The symbol * indicates the fact that there are no irreducible
cyclotomic polynomials of degree 14 (since there are no integers N with $\varphi(N) = 14$).

 g   k'(g)   k(g)                      k(g)/g
 1     6       6                          6
 2    12      12                          6
 3    18      30 = lcm(6, 10)            10
 4    30      60 = lcm(10, 12)           15
 5    22     120 = lcm(8, 10, 6)         24
 6    42     210 = lcm(6, 10, 14)        35
 7     *     420 = lcm(5, 7, 12)         60
 8    60     840 = lcm(3, 5, 7, 8)      105

Table 1 gives some values for k(g). We only list values for $g \le 8$ since there
are various algorithms (see [12]) for solving the discrete logarithm problem on
high-genus curves. The notation indicates how the maximum value is attained.
For example the case k(3) = 30 comes from the cyclotomic polynomials $\Phi_6(X) = X^2 - X + 1$
and $\Phi_{10}(X) = X^4 - X^3 + X^2 - X + 1$. It follows that the smallest
degree m such that $\Phi_6(X)\Phi_{10}(X) \mid (X^m - 1)$ is m = lcm(6, 10) = 30. Hence an
abelian variety with $P(X) = q^3 \Phi_6(X/\sqrt{q}) \Phi_{10}(X/\sqrt{q})$ (which must exist by the
Honda-Tate theorem [35]) would have embedding degree 30. However, we have
not found a curve whose Jacobian is isogenous to such an abelian variety.
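The values of k'(g) and k(g) in Table 1 can be recomputed by brute force. The sketch below is an illustration under stated assumptions rather than the procedure used to prepare the table: it uses the fact that the n-th cyclotomic polynomial has degree $\varphi(n)$ and is the unique irreducible factor of $X^m - 1$ with m(p(X)) = n, together with the elementary bound $\varphi(n) \ge \sqrt{n/2}$ to limit the search. All function names are hypothetical.

```python
from math import gcd


def euler_phi(n):
    """Euler's totient function by direct counting (adequate for small n)."""
    return sum(1 for i in range(1, n + 1) if gcd(i, n) == 1)


def orders_by_genus(g_max):
    """Map g to the list of n with phi(n) = 2g."""
    # phi(n) >= sqrt(n/2), so phi(n) <= 2*g_max forces n <= 2*(2*g_max)**2.
    bound = 2 * (2 * g_max) ** 2
    table = {g: [] for g in range(1, g_max + 1)}
    for n in range(3, bound + 1):
        phi = euler_phi(n)
        if phi % 2 == 0 and phi // 2 in table:
            table[phi // 2].append(n)
    return table


def k_bounds(g_max):
    """Return (k_prime, k) as dictionaries indexed by the genus g."""
    orders = orders_by_genus(g_max)
    k_prime = {g: (max(ns) if ns else None) for g, ns in orders.items()}
    # reachable[g] = all lcm values obtainable from factors of total degree 2g.
    reachable = {0: {1}}
    for g in range(1, g_max + 1):
        values = set()
        for part in range(1, g + 1):
            for n in orders[part]:
                for m in reachable[g - part]:
                    values.add(m * n // gcd(m, n))
        reachable[g] = values
    k = {g: max(reachable[g]) for g in range(1, g_max + 1)}
    return k_prime, k
```

Calling `k_bounds(8)` returns both columns for $g \le 8$; the entry for g = 7 in the first dictionary is `None`, corresponding to the symbol * in Table 1.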

The bounds given are sharp, in the sense that there exists an abelian variety 
over some finite field Fg for which the bound k{g) is attained (note also that 
we recover the bound k = 6 in the elliptic curve case). However, we are more 
interested in Jacobian varieties of curves than in general abelian varieties. It is 
therefore important to determine which values for k can arise as the Jacobian of 
a curve. We return to this problem in Section 4.4. 

What do these results tell us about the security of the discrete logarithm 
problem in the divisor class group of a curve? Recall that the advantage of the
divisor class group of a curve of genus g is that, over a field $\mathbb{F}_q$, the
group has size approximately $q^g$. Hence, to determine the applicability of the
subexponential algorithms for solving the discrete logarithm problem in finite
fields, we really should consider the ratio k(g)/g, which is seen in Table 1 to grow
rather slowly. This supports the notion that supersingular curves are weaker than 
the general case for standard discrete logarithm based cryptosystems. 

4.4 Are Large Values of k Attained for Curves? 

In this section some examples of curves with relatively large values for k are 
given (see Table 2). When g > 2 it is seen that the values are much smaller than
the upper bounds given in Table 1. It is an interesting open problem to find the
exact largest values of k for each genus, and we hope that this paper motivates 
further work on the problem. 

The fact that the maximum value of k is attained in the case of genus one 
and two curves is not surprising since every elliptic curve is a Jacobian, and 
every isogeny class of abelian varieties of dimension two contains a representative 
which is either a product of elliptic curves or the Jacobian of a hyperelliptic curve 
(possibly this process requires an extension of the ground field). However, in the 
case of dimension four or more we would not necessarily expect the bounds to 
be attained. 

The case of dimension three is particularly interesting. Simple abelian vari- 
eties of dimension three should be isogenous to a Jacobian of a genus three curve 
(not necessarily hyperelliptic) over some extension field. However, we have not 
found any supersingular curves giving large values of k. Further, we have not 
found any supersingular hyperelliptic curves of genus three in characteristic two. 

The reason for only listing curves defined over small fields is that, for elliptic
curves, one can only obtain k > 3 in characteristic two or three, and we expect
analogous results in the higher genus case.

Table 2. Table of curves with large k. Notes:

(1) In the first row p must be an odd prime congruent to 2 modulo 3.

(2) This genus 3 curve is a plane quartic and is not hyperelliptic. It can be written as
the affine superelliptic curve $z^3 = x^4 + \theta x$.

 Field                                            Curve                               Genus   # points   k
 $\mathbb{F}_p$                                   $y^2 = x^3 + a$                     1       p + 1      2
 $\mathbb{F}_3$                                   $y^2 = x^3 + 2x \pm 1$              1       7, 1       6
 $\mathbb{F}_2$                                   $y^2 + y = x^5 + x^3$               2       13         12
 $\mathbb{F}_3$                                   $y^2 = x^6 + x + 2$                 2       13         3
 $\mathbb{F}_5$                                   = a;® + 2a; + a;® + a; + 3          2       11         5
 $\mathbb{F}_{2^2} = \mathbb{F}_2(\theta)$        $x^4 + \theta x y^3 + y z^3$        3       57         9
 $\mathbb{F}_3$                                   $y^2 = x^7 + 1$                     3       28         6
 $\mathbb{F}_5$                                   = a;® + 2a;'‘ + 3a;^ + 2            3       66         10
 $\mathbb{F}_7$                                   y"^ = x^ + x'^ + 5a;®               3       911        14
 $\mathbb{F}_2$                                   y‘‘ -\- y = x'^ + x“^ G 1           4       5          12


5 Equations of Supersingular Curves 

For applications, especially when using subfield curves, it is very important to 
know in advance which equations are likely to give rise to supersingular curves. 
For instance, Sakai, Sakurai and Ishizuka [27] suggested some hyperelliptic curves 
for use in cryptography. On page 172 they state that they were unable to find 
any secure genus 2 curves over $\mathbb{F}_2$ and speculated that this was caused by their
restriction to the field $\mathbb{F}_2$ (instead of using $\mathbb{F}_{2^n}$). In fact, the reason for this is that
they only considered equations of the form $C : y^2 + y = f(x)$ with $f(x) \in \mathbb{F}_2[x]$
monic of degree 5. We will show that all genus two curves of this form over all
fields $\mathbb{F}_{2^n}$ are supersingular.

The first observation is that any hyperelliptic curve in characteristic two of
the form $y^2 + h(x)y = f(x)$ with $1 \le \deg(h(x)) \le g + 1$ cannot be supersingular.
To see this note that any root $x_0$ of h(x) gives rise to a point $(x_0, y_0)$ (possibly
over a quadratic extension) of order 2, but a supersingular curve in characteristic 
p has no points (even over algebraic extensions) of order p. 

Therefore, curves of the form $y^2 + y = f(x)$ are a poor choice in characteristic
two if one wants to avoid supersingular cases. However, the argument sketched 
above does not imply that all such curves are necessarily supersingular. Indeed, 
there are curves of this form which are not supersingular when the genus is three 
or more. Our main result in this section is that all such curves are supersingular 
in the case of genus two. 

Theorem 4. Let C be a genus 2 curve over $\mathbb{F}_{2^n}$ of the form $y^2 + cy = f(x)$
where f(x) is monic of degree 5 and $c \in \mathbb{F}_{2^n}^*$. Then C is supersingular.

Before giving the proof of the theorem it is necessary to obtain the following
result about the polynomials P(X) for curves of this form.

Lemma 2. Let C be a genus 2 curve over $\mathbb{F}_{2^n}$ of the form $y^2 + cy = f(x)$ where
f(x) is monic of degree 5 and $c \in \mathbb{F}_{2^n}^*$. Then the coefficients $a_1$ and $a_2$ in the
polynomial P(X) are both even.

Proof. For equations of this form the number of points on the curve over all
extensions $\mathbb{F}_{2^{nm}}$ is odd, since apart from the point at infinity, points come in
pairs $(x_0, y_0)$ and $(x_0, y_0 + c)$. The fact that $\#C(\mathbb{F}_{2^n}) = 2^n + 1 - t_1$ is odd implies
that $t_1$, and hence $a_1 = -t_1$, is even.

On $C(\mathbb{F}_{2^{2n}})$ there are two points for each possible $x_0 \in \mathbb{F}_{2^n}$ (the corresponding
y-coordinates may be in $\mathbb{F}_{2^n}$ or $\mathbb{F}_{2^{2n}}$). For any point with $x_0 \notin \mathbb{F}_{2^n}$ there are
the four distinct 'conjugates' $(x_0, y_0)$, $(x_0, y_0 + c)$, $(\pi(x_0), \pi(y_0))$, $(\pi(x_0), \pi(y_0) + c)$
where $\pi$ is the Frobenius automorphism of $\mathbb{F}_{2^{2n}}/\mathbb{F}_{2^n}$. It follows that
$\#C(\mathbb{F}_{2^{2n}}) \equiv 1 \pmod{4}$. Write $t_2 = 2^{2n} + 1 - \#C(\mathbb{F}_{2^{2n}})$. Then $t_2$ is divisible by 4 and from
$a_1^2 = t_2 + 2a_2$ it follows that $a_2$ is even. □

If the curve C is actually defined over $\mathbb{F}_2$ then Theorem 2 implies that the
curve is supersingular. In the general case we need a further argument.

Proof (of Theorem 4). Using Lemma 2 we see that $P(X) \equiv X^4 \pmod{2}$. By a
result of Manin [22] (also see Stichtenoth [32] Satz 1) it follows that $\mathrm{Jac}(C)(\overline{\mathbb{F}}_2)$
has no points of order 2. In the case of dimension 2, this condition is known (see
Li and Oort [20] p. 9) to be equivalent to supersingularity. □

An alternative proof of the above result can be given by using the theory of
the Newton polygon and some class field theory. One shows that, in genus 2, the
only polynomials P(X) which satisfy the condition of Lemma 2 also satisfy the
condition of Theorem 2 (see Rück [25] for details of this approach).

Note that both of these arguments rely heavily on the fact that we are in the 
genus two case. In the case of genus three it is possible to give ‘safe’ examples. 
For instance, the curve $C : y^2 + y = x^7$ of [27] has $P(X) = X^6 - 2X^3 + 2^3$ and
the fact that $a_3$ is not divisible by $2^{\lceil 3/2 \rceil}$ means that C is not supersingular.

We note that $\#C(\mathbb{F}_2)$ and $\#C(\mathbb{F}_{2^2})$ being odd does not alone imply that C
is supersingular. An example is the genus two curve $y^2 + (x^2 + x + 1)y = x^5 + 1$
which has 3 points over $\mathbb{F}_2$ and 7 points over $\mathbb{F}_{2^2}$ and so $P(X) = X^4 + X^2 + 4$
and C is not supersingular.

The authors of [27] could have considered curves of the form $y^2 + xy = f(x)$
(with degree five $f(x) \in \mathbb{F}_2[x]$). In these cases it is clear that $\#C(\mathbb{F}_{2^n})$ is always
even, in which case $a_1$ is always odd and, by Theorem 2, the curve cannot be
supersingular. Indeed, the same argument shows that curves of the form $y^2 + xy = f(x)$
with $f(x) \in \mathbb{F}_{2^n}[x]$ of odd degree are an infinite family of non-supersingular
hyperelliptic curves. It is easy to find suitable examples of genus 2 curves of this
form, for instance $C : y^2 + xy = x^5 + x^2 + 1$ has $P(X) = X^4 - X^3 - 2X + 4$.
One can show that

   $\#\mathrm{Jac}(C)(\mathbb{F}_{2^{97}})$ = 2 · 389 · 1747 ·
      18473392463868826910318794676754071940716909907019619
   $\#\mathrm{Jac}(C)(\mathbb{F}_{2^{103}})$ = 2 · 47381 ·
      1085287719049570327739050925845914539948927360923370110769

where the large numbers are proven primes according to Magma. In both cases
the Frey-Rück embedding degree exceeds 10®°.

The above arguments suggest that, in characteristic two, only curves of the
form $y^2 + h(x)y = f(x)$ where $\deg(h(x)) \ge 1$ should be used in cryptography.
However, this is not necessarily the conclusion one wants to draw, since equations
of the form $y^2 + y = f(x)$ give some implementation efficiency (see Smart [30]
Section 1 and [7] Theorem 14).

Another strategy would be to use genus two curves of the form $y^2 + h(x)y = f(x)$
over $\mathbb{F}_{2^n}$ which always have two points at infinity (i.e., deg(h(x)) = 3 such
that h(x) has no root in the ground field). In these cases one also has $a_1$ odd,
and so the curves are not supersingular.

Appendix 1. Methods to Compute P(X)

Very recently there have been some breakthroughs [15], [13] in algorithms for
counting points and computing P(X) on higher genus curves in the case of
small characteristic. Nevertheless there is still interest in using subfield curves.
We discuss some methods to compute P(X) for curves C defined over small
fields $\mathbb{F}_q$.

First we give the most elementary method. Given a curve $C/\mathbb{F}_q$ of genus
$g \ge 1$ compute $\#C(\mathbb{F}_{q^r})$ for $1 \le r \le g$ by exhaustive search. If the curve is
given as a non-singular plane curve f(x, y) = 0 with a known number of rational
points at infinity then the exhaustive search involves trying all values $x_0 \in \mathbb{F}_{q^r}$
and then calculating the number of roots of $f(x_0, y)$ in $\mathbb{F}_{q^r}$. From the values
$t_r = q^r + 1 - \#C(\mathbb{F}_{q^r}) = \sum_{i=1}^{2g} \alpha_i^r$ one can obtain the coefficients of P(X) using
Newton's identities $m a_m = -t_m - \sum_{i=1}^{m-1} a_i t_{m-i}$ (see Cohen [5] Proposition
4.3.3). This naive algorithm takes time $O(q^g (\log q^g)^c)$ for some constant c, which
can also be written as $O(q^{g+\epsilon})$.
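The elementary method can be sketched for curves $y^2 + h(x)y = f(x)$ in characteristic two. This is a minimal illustration, not the code used for this paper. It assumes field elements are stored as bit masks in a polynomial basis, that the caller supplies an irreducible modulus of degree n, and that polynomial coefficients are listed from the highest degree down; all function names are hypothetical. The last function applies Newton's identities in the form $m a_m = -t_m - \sum_{i=1}^{m-1} a_i t_{m-i}$ and then the symmetry $a_{2g-i} = q^{g-i} a_i$.

```python
def gf_mul(a, b, modulus, n):
    """Multiply two elements of GF(2^n) = GF(2)[t]/(modulus)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> n:  # degree reached n: reduce modulo the field polynomial
            a ^= modulus
    return result


def poly_eval(coeffs, x, modulus, n):
    """Horner evaluation of a polynomial with coefficients in GF(2^n)."""
    acc = 0
    for c in coeffs:
        acc = gf_mul(acc, x, modulus, n) ^ c
    return acc


def count_points(h, f, modulus, n, points_at_infinity=1):
    """#C(GF(2^n)) for C: y^2 + h(x) y = f(x), by exhaustive search."""
    size = 1 << n
    total = points_at_infinity
    for x in range(size):
        hx = poly_eval(h, x, modulus, n)
        fx = poly_eval(f, x, modulus, n)
        for y in range(size):
            if gf_mul(y, y, modulus, n) ^ gf_mul(hx, y, modulus, n) == fx:
                total += 1
    return total


def frobenius_polynomial(q, g, counts):
    """[1, a_1, ..., a_2g] from counts[r-1] = #C(F_{q^r}) for r = 1..g."""
    t = [None] + [q ** r + 1 - counts[r - 1] for r in range(1, g + 1)]
    a = [1]
    for m in range(1, g + 1):
        acc = -t[m] - sum(a[i] * t[m - i] for i in range(1, m))
        assert acc % m == 0  # Newton's identities give an exact division
        a.append(acc // m)
    for i in range(g - 1, -1, -1):
        a.append(q ** (g - i) * a[i])  # a_{2g-i} = q^(g-i) a_i
    return a
```

For the genus two curve $y^2 + (x^2 + x + 1)y = x^5 + 1$ one would call `count_points([1, 1, 1], [1, 0, 0, 0, 0, 1], 0b11, 1)` and `count_points([1, 1, 1], [1, 0, 0, 0, 0, 1], 0b111, 2)`, then pass the two counts to `frobenius_polynomial(2, 2, ...)`.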

One method to speed this up is to compute $\#C(\mathbb{F}_{q^r})$ for $r = 1, \ldots, g - 1$
and then to try all values of $\#C(\mathbb{F}_{q^g}) - (q^g + 1)$ (i.e., all integers in the interval
$[-2g q^{g/2}, 2g q^{g/2}]$) and test the correctness of the group order probabilistically
by computations on Jac(C) over $\mathbb{F}_q$ or over some extension $\mathbb{F}_{q^m}$. This produces
a method of complexity $O(q^{g-1+\epsilon})$.

A variation on the above strategy is to use the method of Stein and Teske
[31] which computes $\#\mathrm{Jac}(C)(\mathbb{F}_q)$ in time proportional to $q^d$ where $d \in \mathbb{Z}$ is
a suitable rounding of (2g - 1)/5. One computes $\#C(\mathbb{F}_{q^r})$ for $r = 1, \ldots, g - 1$
and then computes $\#\mathrm{Jac}(C)(\mathbb{F}_q)$ from which it is possible to deduce P(X). This
method also has complexity $O(q^{g-1+\epsilon})$.

Similarly, one can compute $\#C(\mathbb{F}_{q^r})$ only up to r = g - 2 and then compute
$\#\mathrm{Jac}(C)(\mathbb{F}_q)$ and $\#\mathrm{Jac}(C)(\mathbb{F}_{q^2})$ using [31]. This method has the superior
complexity when g = 4 or $g \ge 6$. This trick cannot be extended.



Appendix 2. Superelliptic Curves 

The case of hyperelliptic curves has been fairly thoroughly explored in the past 
[16], [17], [3], [27], [30]. In particular, Buhler and Koblitz [3] mention cases which 
are guaranteed to be non-supersingular. 

A superelliptic curve (see [10]) is a curve given by an affine equation of
the form $y^n = f(x)$ over $\mathbb{F}_q$ where gcd(n, q) = 1, gcd(n, deg f(x)) = 1 and
gcd(f(x), f'(x)) = 1. Such curves have only one point at infinity and they have
genus $\frac{1}{2}(n - 1)(\deg f(x) - 1)$.

Note that the curve $y^3 = f(x)$ over $\mathbb{F}_{2^n}$ has exactly $2^n + 1$ points when n is
odd (since in those cases 3 is coprime to the order of $\mathbb{F}_{2^n}^*$). This means that, in the
case where the ground field is an odd degree extension of $\mathbb{F}_2$, to compute P(X)
it is only necessary to count the number of points over even degree extensions
of the ground field. In other words, when g is odd, one can compute P(X) in
time $O(q^{g-1+\epsilon})$. On the other hand, such curves do not have full 2-torsion and
so they are not fully general among all superelliptic curves.
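Both facts used above are quick to check numerically. The helper names below are hypothetical and the sketch only restates the two conditions: the genus formula $\frac{1}{2}(n-1)(\deg f - 1)$, and the observation that cubing permutes the field with $2^n$ elements exactly when 3 is coprime to $2^n - 1$, which happens when n is odd.

```python
from math import gcd


def superelliptic_genus(n, deg_f):
    """Genus of y^n = f(x) when gcd(n, deg f) = 1 (other conditions assumed)."""
    if gcd(n, deg_f) != 1:
        raise ValueError("n and deg f must be coprime")
    return (n - 1) * (deg_f - 1) // 2


def cubing_is_bijection(n):
    """True when x -> x^3 permutes GF(2^n), i.e. gcd(3, 2^n - 1) = 1."""
    return gcd(3, 2 ** n - 1) == 1
```

When `cubing_is_bijection(n)` holds, each x contributes exactly one affine point to $y^3 = f(x)$, which gives the count $2^n + 1$ once the point at infinity is included.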

Table 3 lists some non-supersingular superelliptic curves. In all cases the large 
numbers I are proven primes according to Magma, and the curves are resistant to 
the Frey-Rück attack. The symbol α represents a generator of the multiplicative
group of the field of definition. As usual, one must be careful about the use of 
curves such as these due to the large automorphism group [6], [12]. 



Table 3. Examples of superelliptic curves suitable for cryptography. 



g = 3   C : y^3 = x^4 + αx^ + x + α over F_{2^2}
        P(X) = X^6 + 3X^4 + 4X^3 + 12X^2 + 2^6
        #Jac(C)(F_{2^(2·41)}) = 2^2 · 3 · 7 · 1231 · 12547 · 839353 ·
          103838175651664516641765501325467649197030008300761187148661 (197 bit)

g = 3   C : y^3 = x^4 + x'^ + αx + 1 over F_{2^5}
        P(X) = X^6 + 39X^4 + 1248X^2 + 2^15
        #Jac(C)(F_{2^(5·23)}) = 2^4 · 3^2 · 5^5 · 7 · 11 · 83 ·
          249210979849057649603915759933900855778626741247624026770184646815
          70978869983922408175831537959 (314 bit)

g = 4   C : y^3 = x^5 + 1 over F_2
        P(X) = X^8 - 2X^4 + 16
        #Jac(C)(F_{2^43}) = 3 · 5 · 4129 ·
          96654730063895670508796204430057604912608599311 (157 bit)

g = 4   C : y^3 = x^5 + x + 1 over F_2
        P(X) = X^8 + 2X^6 + 6X^4 + 8X^2 + 16
        #Jac(C)(F_{2^43}) = 3 · 11 ·
          181403354742656313080878192304365317354825710535649 (167 bit)
        #Jac(C)(F_{2^61}) = 3 · 11 · 12323 ·
          69516604910881473963537569029137158267066937810090081
          343111639513643 (226 bit)





Short Signatures from the Weil Pairing 



Dan Boneh*, Ben Lynn, and Hovav Shacham 

Computer Science Department, Stanford University 
{dabo,blynn,hovav}@cs.stanford.edu



Abstract. We introduce a short signature scheme based on the Computational
Diffie-Hellman assumption on certain elliptic and hyper-elliptic
curves. The signature length is half the size of a DSA signature for a 
similar level of security. Our short signature scheme is designed for sys- 
tems where signatures are typed in by a human or signatures are sent 
over a low-bandwidth channel. 



1 Introduction 

Short digital signatures are needed in environments where a human is asked to 
manually key in the signature. For example, product registration systems of- 
ten ask users to key in a signature provided on a CD label. More generally, 
short signatures are needed in low-bandwidth communication environments. For 
example, short signatures are needed when printing a signature on a postage 
stamp [21,19]. Currently, the two most frequently used signatures schemes, RSA 
and DSA, provide relatively long signatures compared to the security they pro- 
vide. For example, when one uses a 1024-bit modulus, RSA signatures are 1024 
bits long. Similarly, when one uses a 1024-bit modulus, standard DSA signatures 
are 320 bits long. Elliptic curve variants of DSA, such as ECDSA, are also 320 
bits long [1]. A 320-bit signature is too long to be keyed in by a human. 

We propose a signature scheme whose length is approximately 160 bits and 
provides a level of security similar to 320-bit DSA signatures. Our signature 
scheme is secure against existential forgery under a chosen message attack (in 
the random oracle model) assuming the Computational Diffie-Hellman problem
(CDH) is hard on certain elliptic curves over a finite field of characteristic three. 
Generating a signature is a simple multiplication on the curve. Verifying the 
signature is done using a bilinear pairing on the curve. Our signature scheme 
inherently uses properties of elliptic curves. Consequently, there is no equivalent 
of our scheme in $\mathbb{F}_p^*$.

Due to the properties of the curves we use, currently we can only provide 
signatures of the lengths given below. The best known algorithm for solving 
the CDH problem in these groups requires a discrete-log on a finite field of 
characteristic three. The size of this field is given (in bits) in the rightmost 
column of the table below. 

* Supported by NSF and the Packard Foundation. 

C. Boyd (Ed.): ASIACRYPT 2001, LNCS 2248, pp. 514-532, 2001. 

(c) Springer-Verlag Berlin Heidelberg 2001

 Signature size   EC group size   Discrete-log Security
 (bits)           (bits)          (bits)
 126              126              752
 154              151              923
 237              220             1417
 259              256             1551
 265              262             1589


The second row shows that we can get a signature of length 154 bits with security 
comparable to 320-bit DSA or 320-bit ECDSA. The best known algorithm to 
forge a 154-bit signature requires one to solve a CDH problem in a finite field 
of size 923 bits or on an elliptic curve group of size 151 bits. In Section 3.5 we 
outline an approach for generalizing our technique and building signatures of 
any length. 

Constructing short signatures is an old problem. Several proposals show how 
to shorten the DSA signature scheme while preserving the same level of security. 
Naccache and Stern [19] propose a variant of DSA where the signature length 
is approximately 240 bits. Mironov [18] suggests a DSA variant with a similar 
length and gives a concrete security analysis of the construction (in the ran- 
dom oracle model). Another technique proposed for reducing the DSA signature 
length is signatures with message recovery [21]. In such systems one encodes a 
part of the message into the signature thus shortening the total length of the 
message-signature pair. For long messages, one can then achieve a DSA signature 
overhead of length 160 bits. However, for very short messages (e.g., 64 bits) the 
total length is still 320 bits. Using our signature scheme, the signature length is 
always on the order of 160 bits, no matter how short the message is. Note that 
when the only transmitted data is the signature (the message is not transmitted) 
DSA signatures with message recovery are not any shorter than standard DSA 
signatures. 

Our signature scheme uses groups where the CDH problem is hard, but the 
Decision Diffie-Hellman problem (DDH) is easy. The first example of such groups 
was given in [12] and was previously used in [11,4]. We call such groups Gap 
Diffie-Hellman groups, or GDH groups for short. Okamoto and Pointcheval [20]
commented that a Gap Diffie-Hellman group gives rise to a signature scheme. 
However, most Gap Diffie-Hellman groups are relatively long and do not lead to 
short signatures. We prove the security of signature schemes derived from GDH
groups and show how they lead to very short signatures. We experiment with 
our proposed signature scheme and give running times in Section 5. 



2 Signature Schemes Based on Gap-Diffie-Hellman

We present a signature scheme that works in any Gap Diffie-Hellman group. As
mentioned above, this scheme is described implicitly by Okamoto and
Pointcheval [20]. The scheme resembles the undeniable signature scheme proposed by
Chaum and Pedersen [5]. In the next section we show how this signature scheme
gives rise to very short signatures.







2.1 Gap Diffie-Hellman Groups (GDH Groups)

Consider a (multiplicative) cyclic group $G = \langle g \rangle$, with p = |G| a prime. We are
interested in three problems on G.

Group Action. Given $u, v \in G$, find uv.

Decision Diffie-Hellman. For $a, b, c \in \mathbb{Z}_p^*$, given $(g, g^a, g^b, g^c)$ decide whether
c = ab.

Computational Diffie-Hellman. For $a, b \in \mathbb{Z}_p^*$, given $(g, g^a, g^b)$, compute $g^{ab}$.

We define a Gap Diffie-Hellman group, in stages.

Definition 1. G is a $\tau$-decision group for Diffie-Hellman if the group action
can be computed in one time unit, and Decision Diffie-Hellman can be computed
on G in time at most $\tau$.

Definition 2. The advantage of an algorithm $\mathcal{A}$ in solving the Computational
Diffie-Hellman problem in a group G is

   $\mathrm{AdvCDH}_{\mathcal{A}} \stackrel{\mathrm{def}}{=} \Pr\left[ \mathcal{A}(g, g^a, g^b) = g^{ab} : a, b \stackrel{R}{\leftarrow} \mathbb{Z}_p^* \right]$

where the probability is over the choice of a and b, and the coin tosses of $\mathcal{A}$.
We say that an algorithm $\mathcal{A}$ $(t, \epsilon)$-breaks Computational Diffie-Hellman in G if
$\mathcal{A}$ runs in time at most t, and $\mathrm{AdvCDH}_{\mathcal{A}} \ge \epsilon$.

Definition 3. A prime order group G is a $(\tau, t, \epsilon)$-GDH group if it is a
$\tau$-decision group for Diffie-Hellman and no algorithm $(t, \epsilon)$-breaks Computational
Diffie-Hellman on it.



2.2 The GDH Signature Scheme 

The GDH Signature Scheme allows the creation of signatures on arbitrary
messages $m \in \{0, 1\}^*$. A signature $\sigma$ is an element of G. The base group G and the
generator g are system parameters. We denote by $G^*$ the set $G^* = G \setminus \{1\}$ where
1 is the identity of G.

The signature scheme comprises three algorithms, KeyGen, Sign, and Verify.
It makes use of a full-domain hash function $h : \{0, 1\}^* \to G^*$. The security
analysis views h as a random oracle [3]. In Section 3.3 we weaken the requirement
on the full-domain hash.

Key Generation. Pick random $x \stackrel{R}{\leftarrow} \mathbb{Z}_p^*$, and compute $v \leftarrow g^x$. The public key
is v. The secret key is x.

Signing. Given a secret key x, and a message $M \in \{0, 1\}^*$, compute $h \leftarrow h(M)$,
and $\sigma \leftarrow h^x$. The signature is $\sigma \in G^*$.

Verification. Given a public key v, a message M, and a signature $\sigma$, compute
$h \leftarrow h(M)$ and verify that $(g, v, h, \sigma)$ is a valid Diffie-Hellman tuple.
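The three algorithms can be exercised on a toy group. The sketch below is purely illustrative and insecure: it assumes the order-11 subgroup generated by 2 in the multiplicative group modulo 23, it replaces the pairing-based Decision Diffie-Hellman test by a brute-force search over all exponents, and its hash function reveals the discrete logarithm of the hash value. The constants and function names are hypothetical and are not part of the scheme.

```python
import hashlib
import secrets

MODULUS = 23       # toy ambient field (assumption for this sketch)
GROUP_ORDER = 11   # prime order p of the subgroup G
GENERATOR = 2      # 2 has order 11 modulo 23


def hash_to_group(message: bytes) -> int:
    """Toy full-domain hash into G* (never returns the identity)."""
    digest = int.from_bytes(hashlib.sha256(message).digest(), "big")
    return pow(GENERATOR, 1 + digest % (GROUP_ORDER - 1), MODULUS)


def keygen():
    """Return (secret key x, public key v = g^x) with x in Z_p^*."""
    x = 1 + secrets.randbelow(GROUP_ORDER - 1)
    return x, pow(GENERATOR, x, MODULUS)


def sign(x: int, message: bytes) -> int:
    """sigma = h(M)^x."""
    return pow(hash_to_group(message), x, MODULUS)


def is_dh_tuple(g: int, v: int, h: int, sigma: int) -> bool:
    """Toy DDH oracle: find log_g(v) by exhaustive search, then compare."""
    for x in range(1, GROUP_ORDER):
        if pow(g, x, MODULUS) == v:
            return pow(h, x, MODULUS) == sigma
    return False


def verify(v: int, message: bytes, sigma: int) -> bool:
    """Accept when (g, v, h(M), sigma) is a Diffie-Hellman tuple."""
    return is_dh_tuple(GENERATOR, v, hash_to_group(message), sigma)
```

A signature produced by `sign` is a single group element, which is the property that makes short signatures possible once a suitable group is available.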

Note that a GDH signature is a single element of $G^*$. Hence, to construct
short signatures we need a GDH group where elements have a short
representation. We construct such groups in Section 3.

2.3 Security

We show the security of the GDH signature scheme against existential forgery,
under chosen-message attacks.

Definition 4. The advantage in existentially forging a signature of a forger
algorithm $\mathcal{F}$, given access to a signing oracle S, is

   $\mathrm{AdvSig}_{\mathcal{F}} = \Pr\left[ \mathrm{Verify}(PK, M, \sigma) = \mathrm{valid} : (PK, SK) \stackrel{R}{\leftarrow} \mathrm{KeyGen},\ (M, \sigma) \stackrel{R}{\leftarrow} \mathcal{F}^S(PK) \right]$

The probability is taken over the coin tosses of the key-generation algorithm, and
of the forger.

Here the adversary $\mathcal{F}$ is allowed to query the signing oracle adaptively: any
of its queries may depend on previous answers, but it may not emit a signature
for a message on which it had previously queried the oracle. The adversary also
has access to the full-domain hash function, which is treated as a random oracle.

Definition 5. A forger $\mathcal{F}$ $(t, q_H, q_S, \epsilon)$-breaks a signature scheme if $\mathcal{F}$ runs in
time at most t, makes at most $q_H$ queries to the hash function and at most $q_S$
queries to the signing oracle S, and $\mathrm{AdvSig}_{\mathcal{F}} \ge \epsilon$.

Definition 6. A signature scheme is $(t, q_H, q_S, \epsilon)$-secure against existential
forgery on adaptive chosen-message attacks if no forger $(t, q_H, q_S, \epsilon)$-breaks it.

The following theorem shows that the GDH signature scheme is secure. The
proof of the theorem is given in Section 4.

Theorem. Let G be a $(\tau, t', \epsilon')$-gap group for Diffie-Hellman of order p. Then
the Gap Signature Scheme on G is $(t, q_H, q_S, \epsilon)$-secure against existential forgery
on adaptive chosen-message attacks, where

   $t \le t' - 2 c_A (\lg p)(q_H + q_S)$ and $\epsilon \ge 2e \cdot q_S \epsilon'$,

and $c_A$ is a small constant. Here e is the base of the natural logarithm.

3 Building Gap-Diffie-Hellman Groups with Small
Representations

Using the Weil pairing, certain elliptic curves may be used as GDH groups. We
recall some necessary facts about elliptic curves (see, e.g., [14,22]), and then
show how to use certain curves for GDH signatures. In particular, we describe
the curves $y^2 = x^3 + 2x \pm 1$ over $\mathbb{F}_{3^l}$.

3.1 Elliptic Curves and the Weil Pairing

An elliptic curve can serve as the basis for a GDH signature scheme if we can
use it to construct some group G with large prime order on which Computational
Diffie-Hellman is difficult, but Decision Diffie-Hellman is easy. First, we
characterize a necessary condition for CDH intractability on a subgroup of E.

Definition 7. Let p be a prime, $l$ a positive exponent, and E an elliptic curve
over $\mathbb{F}_{p^l}$ with m points. Let P in E be a point of prime order q where $q^2 \nmid m$. We
say that the subgroup $\langle P \rangle$ has a security multiplier $\alpha$, for some integer $\alpha > 0$, if
the order of $p^l$ in $\mathbb{F}_q^*$ is $\alpha$. In other words:

   $q \mid p^{l\alpha} - 1$ and $q \nmid p^{lk} - 1$ for all $k = 1, 2, \ldots, \alpha - 1$

It is well known (as shown below) that for CDH to be hard in the subgroup
$\langle P \rangle$ we must have that the security multiplier, $\alpha$, for this subgroup is not too
small. On the other hand, to get an efficient Decision Diffie-Hellman algorithm
in $\langle P \rangle$ we need that $\alpha$ is not too large. Therefore, the problem in constructing
short signatures is to find curves for which $\alpha$ is sufficiently large for security,
but sufficiently small for efficiency. Using current security parameters, $\alpha = 6$ is
sufficient for obtaining short signatures. It is an open problem to build elliptic
curves with slightly higher $\alpha$, say $\alpha = 10$ (see Section 3.5).
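The security multiplier of Definition 7 is simply the multiplicative order of $p^l$ modulo q, so it can be found by repeated multiplication whenever q is small enough to iterate over. The following sketch assumes q is a prime that does not divide p; the function name is hypothetical.

```python
def security_multiplier(p, l, q):
    """Order of p**l in the multiplicative group modulo the prime q."""
    base = pow(p, l, q)
    if base == 0:
        raise ValueError("q must not divide p")
    alpha, power = 1, base
    while power != 1:
        power = power * base % q
        alpha += 1
    return alpha
```

For instance, with p = 3, l = 1 and a subgroup of order q = 7 the function returns 6, the value of $\alpha$ quoted above as sufficient for short signatures.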

Discrete-log on elliptic curves: Let $\langle P \rangle$ be a subgroup of $E/\mathbb{F}_{p^l}$ of order q
with security multiplier $\alpha$. We briefly discuss two standard ways for computing
discrete-log in $\langle P \rangle$.

1. MOV: Use an efficiently computable homomorphism, as in the
   Menezes-Okamoto-Vanstone reduction [15], to map the discrete log problem in $\langle P \rangle$
   to a discrete log problem in some extension of $\mathbb{F}_{p^l}$, say $\mathbb{F}_{p^{li}}$. We require that
   the image of $\langle P \rangle$ under this homomorphism is a subgroup of $\mathbb{F}_{p^{li}}^*$ of order
   q. Thus we have $q \mid (p^{li} - 1)$, which by the definition of $\alpha$ implies that $i \ge \alpha$.
   Hence, the MOV method can, at best, reduce the discrete log problem in
   $\langle P \rangle$ to a discrete log problem in a subgroup of $\mathbb{F}_{p^{l\alpha}}^*$. Therefore, to ensure
   that discrete log is hard in $\langle P \rangle$ we want curves with large $\alpha$.

2. Generic: Generic discrete log algorithms such as the Baby-Step-Giant-Step
   and Pollard's Rho method [16] have a running time proportional to $\sqrt{q}$.
   Therefore, we must ensure that q is sufficiently large.

Decision Diffie-Hellman on elliptic curves: Let $P \in E/\mathbb{F}_{p^l}$ be a point of
prime order $q$. Suppose the subgroup $\langle P \rangle$ has security multiplier $\alpha$. We
assume $q \nmid p^l - 1$. A result of Balasubramanian and Koblitz [2] shows that
$E/\mathbb{F}_{p^{l\alpha}}$ contains a point $Q$ that is linearly independent of $P$. Such a
point $Q \in E/\mathbb{F}_{p^{l\alpha}}$ can be efficiently found. Note that linear independence
of $P$ and $Q$ can be verified via the Weil pairing described below.

With two linearly independent points $P \in E/\mathbb{F}_{p^l}$ and $Q \in E/\mathbb{F}_{p^{l\alpha}}$,
each of order $q$, we can use the Weil pairing to answer certain questions that
will allow us to construct a DDH oracle [12]. Let $E[q]$ denote the subgroup of
$E/\mathbb{F}_{p^{l\alpha}}$ generated by $P$ and $Q$. The Weil pairing is a map
$e : E[q] \times E[q] \to \mathbb{F}_{p^{l\alpha}}^*$ with the following properties:

1. Identity: for all $R \in E[q]$, $e(R, R) = 1$.

2. Bilinear: for all $R_1, R_2 \in E[q]$ and $a, b \in \mathbb{Z}$ we have that
   $e(aR_1, bR_2) = e(R_1, R_2)^{ab}$.

3. Non-degenerate: if for $R \in E[q]$ we have $e(R, R') = 1$ for all $R' \in E[q]$,
   then $R = O$.

4. Computable: for all $R_1, R_2 \in E[q]$, the pairing $e(R_1, R_2)$ can be computed
   efficiently [17].

Note that $e(R_1, R_2) = 1$ if and only if $R_1$ and $R_2$ are linearly dependent.

For the linearly independent points $P$ and $Q$, both of order $q$, the Weil pairing
allows us to determine whether the tuple $(P, aP, Q, bQ)$ is such that
$a = b \bmod q$; indeed,

$$a = b \bmod q \iff e(P, bQ) = e(aP, Q) .$$

Suppose we also have a computable isomorphism $\phi$ from $\langle P \rangle$ to $\langle Q \rangle$.
Necessarily, $\phi$ is such that, for all $a$, $\phi(aP) = axQ$, where $xQ = \phi(P)$. In
this case, the Weil pairing allows us to determine whether the tuple
$(P, aP, bP, cP)$ is such that $ab = c \bmod q$:

$$ab = c \bmod q \iff e(P, \phi(cP)) = e(aP, \phi(bP)) .$$

With the isomorphism $\phi$, the Weil pairing provides an algorithm for Decision
Diffie-Hellman. Note that the algorithm for DDH requires two evaluations of the
Weil pairing for points over $\mathbb{F}_{p^{l\alpha}}$.
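
The reason this test decides DDH can be spelled out using only bilinearity and the relation $\phi(aP) = axQ$. The short derivation below is a sketch of that reasoning, not text from the original paper.

```latex
\begin{align*}
e(P, \phi(cP))  &= e(P, cxQ)  = e(P, Q)^{cx}, \\
e(aP, \phi(bP)) &= e(aP, bxQ) = e(P, Q)^{abx}.
\end{align*}
% P and Q are linearly independent, so e(P, Q) \neq 1, while
% e(P, Q)^q = e(qP, Q) = 1; since q is prime, e(P, Q) has order exactly q.
% Because \phi is an isomorphism, x is invertible modulo q. Hence
\[
e(P, \phi(cP)) = e(aP, \phi(bP))
\iff cx \equiv abx \pmod{q}
\iff c \equiv ab \pmod{q}.
\]
```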



3.2 A Special Supersingular Curve 



Using the machinery of Section 3.1, we derive GDH groups with small
representation from the supersingular elliptic curves $E$ given by
$y^2 = x^3 + 2x \pm 1$ over $\mathbb{F}_{3^l}$. As we will see, these are unique supersingular
elliptic curves with security multiplier 6. Hence, the MOV reduction maps the
discrete log problem in $E/\mathbb{F}_{3^l}$ to $\mathbb{F}_{3^{6l}}^*$. This means that we can use
relatively small values of $l$ to obtain short signatures, but the security is
dependent on a discrete log problem in a large finite field. We use two simple
lemmas to describe the behavior of these curves (see also [23,13]).

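Lemma 1. The curve $E^+$ defined by $y^2 = x^3 + 2x + 1$ over $\mathbb{F}_{3^l}$ satisfies

$$\#E^+/\mathbb{F}_{3^l} = \begin{cases}
3^l + 1 + \sqrt{3 \cdot 3^l} & \text{when } l = \pm 1 \bmod 12, \text{ and} \\
3^l + 1 - \sqrt{3 \cdot 3^l} & \text{when } l = \pm 5 \bmod 12
\end{cases}$$

The curve $E^-$ defined by $y^2 = x^3 + 2x - 1$ over $\mathbb{F}_{3^l}$ satisfies

$$\#E^-/\mathbb{F}_{3^l} = \begin{cases}
3^l + 1 - \sqrt{3 \cdot 3^l} & \text{when } l = \pm 1 \bmod 12, \text{ and} \\
3^l + 1 + \sqrt{3 \cdot 3^l} & \text{when } l = \pm 5 \bmod 12
\end{cases}$$

Proof. See [13, section 2]. □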
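
Lemma 1 can be sanity-checked numerically. The sketch below counts the points over $\mathbb{F}_3$ by brute force and then lifts the count to $\mathbb{F}_{3^l}$ with the standard trace-of-Frobenius recurrence $t_n = t_1 t_{n-1} - 3 t_{n-2}$, $t_0 = 2$, $\#E(\mathbb{F}_{3^n}) = 3^n + 1 - t_n$. That recurrence is a general fact about elliptic curves and is not taken from the paper; the function names are illustrative assumptions.

```python
from math import isqrt

def count_points_f3(c):
    """Brute-force point count of y^2 = x^3 + 2x + c over F_3 (c = +1 or -1),
    including the point at infinity."""
    return 1 + sum(1 for x in range(3) for y in range(3)
                   if (y * y - (x ** 3 + 2 * x + c)) % 3 == 0)

def order_by_trace(c, l):
    """Point count over F_{3^l} obtained from the count over F_3."""
    t1 = 3 + 1 - count_points_f3(c)          # trace of Frobenius over F_3
    t_prev, t_cur = 2, t1                     # t_0 and t_1
    for _ in range(l - 1):
        t_prev, t_cur = t_cur, t1 * t_cur - 3 * t_prev
    return 3 ** l + 1 - t_cur

def order_by_lemma1(c, l):
    """Point count as stated in Lemma 1 (l mod 12 must be 1, 5, 7 or 11)."""
    root = isqrt(3 ** (l + 1))                # sqrt(3 * 3^l), exact for odd l
    plus = (l % 12 in (1, 11)) == (c == 1)    # sign pattern of Lemma 1
    return 3 ** l + 1 + root if plus else 3 ** l + 1 - root
```

For $l = 1$ the brute-force count gives 7 points on $E^+$ and 1 point on $E^-$, matching $3 + 1 \pm 3$.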
We have thus shown how to construct an elliptic curve with
$3^l + 1 \pm \sqrt{3 \cdot 3^l}$ points over $\mathbb{F}_{3^l}$, simply by selecting one of $E^-$
and $E^+$ as appropriate, whenever $l \bmod 12$ equals $\pm 1$ or $\pm 5$.

Lemma 2. Let $E$ be an elliptic curve defined by $y^2 = x^3 + 2x \pm 1$ over
$\mathbb{F}_{3^l}$, where $l \bmod 12$ equals $\pm 1$ or $\pm 5$. Then $\#(E/\mathbb{F}_{3^l})$ divides
$3^{6l} - 1$.

Proof. We have $x^6 - 1 = (x^3 - 1)(x^3 + 1) = (x - 1)(x^2 + x + 1)(x + 1)(x^2 - x + 1)$,
so for any integer $x$ it follows that $(x^2 - x + 1) \mid (x^6 - 1)$. In particular,
when $x = 3^l$ we see that $(3^{2l} - 3^l + 1) \mid (3^{6l} - 1)$. Now when $E$ is an
elliptic curve as above, we know that $\#(E/\mathbb{F}_{3^l})$ is either
$3^l + 1 + \sqrt{3 \cdot 3^l}$ or $3^l + 1 - \sqrt{3 \cdot 3^l}$. But
$((3^l + 1) + \sqrt{3 \cdot 3^l})((3^l + 1) - \sqrt{3 \cdot 3^l}) = 3^{2l} - 3^l + 1$.
Thus $\#(E/\mathbb{F}_{3^l}) \mid (3^{6l} - 1)$. □

Together, Lemmas 1 and 2 show that, for the relevant values of $l$, the curves
$E^+/\mathbb{F}_{3^l}$ and $E^-/\mathbb{F}_{3^l}$ will have security parameters $\alpha$ at most 6 (more
specifically: $\alpha \mid 6$). Whether the security parameter actually is 6 for a
particular prime subgroup of a curve must be determined by computation.

Automorphism of $E^+, E^-/\mathbb{F}_{3^{6l}}$: For $l$ such that $l \bmod 12$ equals $\pm 1$ or
$\pm 5$, compute three elements of $\mathbb{F}_{3^{6l}}$, $u$, $r^+$, and $r^-$, satisfying
$u^2 = -1$, $(r^+)^3 + 2r^+ + 2 = 0$, and $(r^-)^3 + 2r^- - 2 = 0$. Now consider the
following maps over $\mathbb{F}_{3^{6l}}$:

$$\phi^+(x, y) = (-x + r^+, uy) \quad\text{and}\quad \phi^-(x, y) = (-x + r^-, uy)$$

Lemma 3. Let $l \bmod 12$ equal $\pm 1$ or $\pm 5$. Then $\phi^+$ is an automorphism of
$E^+/\mathbb{F}_{3^{6l}}$ and $\phi^-$ is an automorphism of $E^-/\mathbb{F}_{3^{6l}}$. Moreover, if $P$ is
a point of order $q$ on $E^+/\mathbb{F}_{3^l}$ (or on $E^-/\mathbb{F}_{3^l}$) then $\phi^+(P)$ (or
$\phi^-(P)$) is a point of order $q$ that is linearly independent of $P$.

Proof. See Silverman [22, p. 326]. □

For a point $P$ of order $q$ on any of these curves, the appropriate automorphism
allows us to solve a Decision Diffie-Hellman question on $G = \langle P \rangle$, as we have
shown in the previous section.

3.3 Hashing onto Elliptic Curves 

The GDH signature scheme needs a hash function $h : \{0,1\}^* \to G^*$ where $G$
is a GDH group. We are proposing to use a subgroup of an elliptic curve as a
GDH group. Since it is difficult to build hash functions that hash directly onto
a subgroup of an elliptic curve we slightly relax the hashing requirement.

Let $E/\mathbb{F}_{p^l}$ be an elliptic curve of order $m$ defined by $y^2 = f(x)$. Let
$P \in E/\mathbb{F}_{p^l}$ be a point of prime order $q$, where $q^2 \nmid m$. We wish to use the
subgroup $G = \langle P \rangle$ as a GDH group for the GDH signature scheme. Suppose we
are given a hash function $h' : \{0,1\}^* \to \mathbb{F}_{p^l} \times \{0,1\}$. Such hash functions
$h'$ can be built from standard cryptographic hash functions. The security
analysis will view $h'$ as a random oracle. We use the following deterministic
algorithm called MapToGroup to hash messages in $\{0,1\}^*$ onto $G^*$. Fix a small
parameter $I = \lceil \log_2 \log_2(1/\delta) \rceil$, where $\delta$ is some desired bound on the
probability of failure.
MapToGroup$_{h'}$: The algorithm defines $h : \{0,1\}^* \to G^*$ as follows:

1. Given $M \in \{0,1\}^*$, set $i \leftarrow 0$;

2. Set $(x, b) \leftarrow h'(i \,\|\, M) \in \mathbb{F}_{p^l} \times \{0,1\}$;

3. If $f(x)$ is a quadratic residue in $\mathbb{F}_{p^l}$ then do:

   3a. Let $y_0, y_1 \in \mathbb{F}_{p^l}$ be the two square roots of $f(x)$. We use
       $b \in \{0,1\}$ to choose between these roots. View $y_0, y_1$ as polynomials
       of degree $l - 1$ over $\mathbb{F}_p$. Then ensure that the constant term of $y_0$
       is not greater than the constant term of $y_1$ when viewed as integers in
       $[0, p]$ (swapping $y_0$ and $y_1$ if necessary). Set $\tilde{P}_M \in E/\mathbb{F}_{p^l}$ to
       be the point $\tilde{P}_M = (x, y_b)$.

   3b. Compute $P_M = (m/q)\tilde{P}_M$. Then $P_M$ is in $G$.
       If $P_M$ is in $G^*$ then output MapToGroup$_{h'}(M) = P_M$ and stop.

4. Otherwise, increment $i$, and goto Step 2; if $i$ reaches $2^I$, report failure.

The failure probability can be made arbitrarily small by picking an appropriately
large $I$. For each $i$, the probability that $h'(i \,\|\, M)$ leads to a point on $G^*$
is approximately 1/2 (where the probability is over the choice of the random
oracle $h'$). Hence, the expected number of calls to $h'$ is approximately 2, and the
probability that a given message $M$ will be found unhashable is $1/2^{(2^I)} \le \delta$.
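
The steps of MapToGroup can be sketched in a few lines. The sketch below is deliberately a toy and makes several assumptions that are not in the paper: it works over a small prime field $\mathbb{F}_{103}$ instead of $\mathbb{F}_{3^l}$ (so the "constant term" of a root is just the integer itself), it builds $h'$ from SHA-256 with a one-byte encoding of $i$, and it finds $m$ and $q$ by brute force. All names are hypothetical.

```python
import hashlib

# Toy parameters (assumptions): prime field with P_MOD % 4 == 3 and the curve
# y^2 = x^3 + A*x + B. The paper works over F_{3^l} instead.
P_MOD, A, B = 103, 2, 1
INF = None  # point at infinity

def f(x):
    return (x * x * x + A * x + B) % P_MOD

def add(p1, p2):
    if p1 is INF: return p2
    if p2 is INF: return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return INF
    if p1 == p2:
        lam = (3 * x1 * x1 + A) * pow(2 * y1, -1, P_MOD) % P_MOD
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD) % P_MOD
    x3 = (lam * lam - x1 - x2) % P_MOD
    return (x3, (lam * (x1 - x3) - y1) % P_MOD)

def mul(k, pt):
    acc = INF
    while k:
        if k & 1: acc = add(acc, pt)
        pt = add(pt, pt)
        k >>= 1
    return acc

def curve_order():
    count = 1  # point at infinity
    for x in range(P_MOD):
        fx = f(x)
        if fx == 0: count += 1
        elif pow(fx, (P_MOD - 1) // 2, P_MOD) == 1: count += 2
    return count

def largest_prime_factor(n):
    best, d = 1, 2
    while d * d <= n:
        while n % d == 0:
            best, n = d, n // d
        d += 1
    return n if n > 1 else best

def h_prime(i, msg):
    """Stand-in for h'(i || M): returns (x, b) in F_p x {0, 1}."""
    digest = hashlib.sha256(bytes([i]) + msg).digest()
    return int.from_bytes(digest[:8], "big") % P_MOD, digest[8] & 1

def map_to_group(msg, m, q, I=5):
    for i in range(2 ** I):                       # Steps 1, 2 and 4
        x, b = h_prime(i, msg)
        fx = f(x)
        y = pow(fx, (P_MOD + 1) // 4, P_MOD)      # square-root candidate
        if y * y % P_MOD != fx:
            continue                              # f(x) is not a residue
        y0, y1 = sorted((y, (-y) % P_MOD))        # Step 3a: order the roots
        pt = mul(m // q, (x, (y0, y1)[b]))        # Step 3b: clear the cofactor
        if pt is not INF:
            return pt
    raise ValueError("MapToGroup failed")
```

Multiplying by the cofactor $m/q$ is what forces the result into the subgroup of order $q$; the retry loop only exists to skip non-residues and the identity.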

Lemma 4. Suppose the GDH signature scheme is $(t, q_H, q_S, \epsilon)$-secure in the
subgroup $G$ when using a random hash function $h : \{0,1\}^* \to G^*$. Then it
is $(t - 2^I q_H \lg m, q_H, q_S, \epsilon)$-secure when the hash function $h$ is computed
with MapToGroup$_{h'}$ where $h'$ is a random hash function
$h' : \{0,1\}^* \to \mathbb{F}_{p^l} \times \{0,1\}$.

Proof Sketch: Suppose a forger algorithm $\mathcal{F}'$ $(t, q_H, q_S, \epsilon)$-breaks the Gap
Signature Scheme on the subgroup $G$ when the hash function $h$ is computed using
MapToGroup$_{h'}$. We construct an algorithm $\mathcal{F}$ that
$(t + 2^I q_H \lg m, q_H, q_S, \epsilon)$-breaks the scheme when $h$ is a random oracle
$h : \{0,1\}^* \to G^*$.

Our new forger $\mathcal{F}$ will run $\mathcal{F}'$ as a black box. $\mathcal{F}$ will use its own hash
oracle $h : \{0,1\}^* \to G^*$ to simulate for $\mathcal{F}'$ the behavior of MapToGroup$_{h'}$.
It uses an array $s_{ij}$ of elements of $\mathbb{F}_{p^l} \times \{0,1\}$. The array has $q_H$ rows
and $2^I$ columns. On initialization, $\mathcal{F}$ fills $s_{ij}$ with uniformly-selected
elements of $\mathbb{F}_{p^l} \times \{0,1\}$.

$\mathcal{F}$ then runs $\mathcal{F}'$, and keeps track of (and indexes) all the unique messages
$M_i$ for which $\mathcal{F}'$ requests an $h'$ hash. When $\mathcal{F}'$ asks for an $h'$ hash of a
message $w \,\|\, M_i$ whose $M_i$ $\mathcal{F}$ had not previously seen (and whose $w$ is an
arbitrary $I$-bit string), $\mathcal{F}$ scans the row $s_{ij}$, $0 \le j < 2^I$. For each
$(x, b) = s_{ij}$, $\mathcal{F}$ follows Step 3 of MapToGroup, above, seeking points in $G^*$.
For the smallest $j$ for which $s_{ij}$ maps into $G^*$, $\mathcal{F}$ replaces $s_{ij}$ with a
different point $(x_i, b_i)$ defined as follows. Let $Q_i = h(M_i) \in G^*$. Then
$\mathcal{F}$ constructs a random $\tilde{Q}_i = (x_i, y_i) \in E/\mathbb{F}_{p^l}$ such that
$(m/q)\tilde{Q}_i = Q_i$. It sets $s_{ij} = (x_i, b_i)$ where $b_i \in \{0,1\}$ is set so that
$(x_i, b_i)$ maps to $\tilde{Q}_i$ in Step 3a of MapToGroup. Then
MapToGroup$_{h'}(M_i) = h(M_i)$ as required.

Once this preliminary patching has been completed, $\mathcal{F}$ is able to answer $h'$
hash queries by $\mathcal{F}'$ for strings $w' \,\|\, M_i$ by simply returning $s_{iw'}$. The
simulated $h'$ which $\mathcal{F}'$ sees is statistically indistinguishable from that in the
real attack. Thus, if $\mathcal{F}'$ succeeds in breaking the signature scheme using
MapToGroup$_{h'}$ then $\mathcal{F}$, in running $\mathcal{F}'$ while consulting $h$, succeeds with the
same likelihood, and suffers only a running-time penalty from maintaining the
additional information and running the exponentiation in Step 3 of MapToGroup. □

3.4 A Concrete Short Signature Scheme 

To summarize things so far, we describe a concrete signature scheme using the
GDH group derived from the curve $E/\mathbb{F}_{3^l}$ defined by $y^2 = x^3 + 2x \pm 1$. Some
useful instantiations of these curves are presented in Table 1. Note that we
restrict these instantiations to those where $l$ is prime, to avoid Weil-descent
attacks [9,10]. As explained in Section 3.3, we use MapToGroup$_{h'}$ to map
arbitrary bit strings to points of order $q$ on $E$, using a hash function $h'$ from
arbitrary strings to elements of $\mathbb{F}_{3^l}$ and an extra bit.



Table 1. Supersingular elliptic curves for GDH Signatures. Here $m = \#E/\mathbb{F}_{3^l}$
and $q$ is the largest prime dividing $m$. The MOV reduction maps the curve onto
a field with $x$ elements.

curve   l     Signature Size       DLog Security        Multiplier   MOV Security
              $\lceil \lg_2 m \rceil$   $\lceil \lg_2 q \rceil$   $\alpha$     $\lceil \lg_2 x \rceil$
$E^-$   79    126                  126                  6            752
$E^+$   97    154                  151                  6            923
$E^+$   149   237                  220                  6            1417
$E^+$   163   259                  256                  6            1551
$E^-$   163   259                  259                  6            1551
$E^+$   167   265                  262                  6            1589
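
The signature-size and MOV-security columns of Table 1 follow directly from Lemma 1 and the multiplier 6, and Lemma 2 can be checked on the same rows. The sketch below recomputes them with exact integer arithmetic; the row tuples are copied from the table, while the helper names are illustrative assumptions. The DLog column is not recomputed because it requires factoring $m$.

```python
from math import isqrt

# (curve sign, l, signature bits, MOV bits) as listed in Table 1
TABLE_1 = [(-1, 79, 126, 752), (+1, 97, 154, 923), (+1, 149, 237, 1417),
           (+1, 163, 259, 1551), (-1, 163, 259, 1551), (+1, 167, 265, 1589)]

def curve_order(sign, l):
    """Number of points on y^2 = x^3 + 2x + sign over F_{3^l} (Lemma 1)."""
    root = isqrt(3 ** (l + 1))                   # sqrt(3 * 3^l), exact for odd l
    plus = (l % 12 in (1, 11)) == (sign == +1)
    return 3 ** l + 1 + (root if plus else -root)

def check_row(sign, l, sig_bits, mov_bits):
    m = curve_order(sign, l)
    return (m.bit_length() == sig_bits                   # ceil(lg m)
            and (3 ** (6 * l)).bit_length() == mov_bits  # ceil(lg 3^(6l))
            and (3 ** (6 * l) - 1) % m == 0)             # Lemma 2
```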



A concrete signature scheme: 

Key generation. Given one of the values $l$ in Table 1, let $E/\mathbb{F}_{3^l}$ be the
    corresponding curve and let $q$ be the largest prime factor of the order of
    the curve. Let $P \in E/\mathbb{F}_{3^l}$ be a point of order $q$. Pick a random
    $x \in \mathbb{Z}_q^*$ and set $R \leftarrow xP$. Then $(l, q, P, R)$ is the public key and $x$
    is the private key.

Signing. To sign a message $M \in \{0,1\}^*$ use algorithm MapToGroup$_{h'}$ to map
    $M$ to a point $P_M \in \langle P \rangle$. Set $S_M \leftarrow xP_M$. The signature $\sigma$ is the
    x-coordinate of $S_M$. Therefore, $\sigma \in \mathbb{F}_{3^l}$.

Verification. Given a public key $(l, q, P, R)$, a message $M$, and a signature
    $\sigma$ do:

    1. Find a point $S \in E/\mathbb{F}_{3^l}$ of order $q$ whose x-coordinate is $\sigma$ and
       whose y-coordinate is $y$ for some $y \in \mathbb{F}_{3^l}$. If no such point exists
       reject the signature as invalid.

    2. Set $u \leftarrow e(P, \phi(S))$ and $v \leftarrow e(R, \phi(h(M)))$, where $e$ is the Weil
       pairing on the curve $E/\mathbb{F}_{3^{6l}}$ and $\phi : E \to E$ is the automorphism of
       the curve described in Lemma 3.

    3. If either $u = v$ or $u^{-1} = v$, accept the signature. Otherwise, reject.
Note that both $(\sigma, y)$ and $(\sigma, -y)$ are points on $E/\mathbb{F}_{3^l}$ that have $\sigma$ as
their x-coordinate. Either one of these two points can be the point $S_M$ used to
generate the signature in the signing algorithm. Indeed, since
$(\sigma, y) = -(\sigma, -y)$ on the curve, we have that $e(P, \phi(-S)) = e(P, \phi(S))^{-1}$.
Therefore, $u = v$ tests that $(P, R, h(M), S)$ is a Diffie-Hellman tuple, while
$u^{-1} = v$ tests that $(P, R, h(M), -S)$ is a Diffie-Hellman tuple.
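
To see why an honestly generated signature passes verification, it may help to write the check out using bilinearity and the fact that $\phi$ commutes with scalar multiplication. The following derivation is a sketch under the notation of the scheme above, with $S_M = x\,h(M)$ and $R = xP$.

```latex
\begin{align*}
u = e(P, \phi(S_M)) &= e(P, \phi(x\,h(M))) = e(P, \phi(h(M)))^{x} \\
                    &= e(xP, \phi(h(M))) = e(R, \phi(h(M))) = v .
\end{align*}
% If the verifier recovers S = -S_M from the x-coordinate instead, then
\[
u = e(P, \phi(-S_M)) = e(P, \phi(S_M))^{-1} = v^{-1},
\qquad\text{so } u^{-1} = v .
\]
```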

The next lemma shows that an attacker capable of existential forgery under 
a chosen message attack (in the random oracle model) is also capable of solving 
the Diffie-Hellman problem in $E/\mathbb{F}_{3^l}$.

Lemma 5. Suppose $E/\mathbb{F}_{3^l}$ is one of the curves given in Table 1, $q$ is the
largest prime dividing $\#E$, $P$ is a point of order $q$ on $E$, and no algorithm
$(t_0, \epsilon_0)$-breaks Computational Diffie-Hellman on $G = \langle P \rangle$. Let
$h' : \{0,1\}^* \to \mathbb{F}_{3^l} \times \{0,1\}$ be a random oracle. Then the concrete
signature scheme described above is $(t, q_H, q_S, \epsilon)$-secure against existential
forgery on adaptive chosen-message attacks (in the random oracle model), where

$$t \le t_0 - 2c_A(\lg q)(q_H + q_S) - 2^I q_H \lg m - 2\tau \quad\text{and}\quad \epsilon \ge 2e \cdot q_S \epsilon_0,$$

and $c_A$ is a small constant.

Proof. By assumption, $G$ is a $(\tau, t_0, \epsilon_0)$-GDH group, where $\tau$ is equal to
twice the time necessary to compute the Weil pairing on $G$. Assuming the
existence of a random oracle $h$ from arbitrary bit strings to $G^*$, the generic
GDH signature scheme (given in Section 2.2) on $G$ is $(t_1, q_H, q_S, \epsilon_1)$-secure
against existential forgery on adaptive chosen-message attacks by the main
theorem (Section 4), where

$$t_1 \le t_0 - 2c_A(\lg q)(q_H + q_S) \quad\text{and}\quad \epsilon_1 \ge 2e \cdot q_S \epsilon_0, \qquad (*)$$

and $c_A$ is a small constant.

By Section 3.3, we can construct a hash function $h$ onto $G^*$ from the hash
function $h'$. By Lemma 4, the generic GDH signature scheme on $G$, using
algorithm MapToGroup$_{h'}$, is $(t_2, q_H, q_S, \epsilon_2)$-secure against existential
forgery on adaptive chosen-message attacks by the main theorem (Section 4),
where

$$t_2 = t_1 - 2^I q_H \lg m \quad\text{and}\quad \epsilon_2 = \epsilon_1. \qquad (**)$$

The only difference between the generic GDH signature scheme on $G$ and the
concrete scheme on $G$ described above is that signatures in the latter scheme are
elements of $\mathbb{F}_{3^l}$, rather than $G$. Given an adversary $\mathcal{F}$ that breaks the
concrete scheme, we can construct an algorithm $\mathcal{A}$ that breaks the generic
scheme, as follows. The public key is identical in the two schemes, so $\mathcal{A}$ simply
provides $\mathcal{F}$ with the $R$ given to it. Hashes are identical in the two schemes, so
$\mathcal{A}$ passes $\mathcal{F}$'s hash requests to its own hash oracle, and provides $\mathcal{F}$ with
the answer. When $\mathcal{F}$ requests a signature on a message $M$, $\mathcal{A}$ obtains the
signature $S \in E$ from its signature oracle, and gives $\mathcal{F}$ the x-coordinate
$\sigma$ of $S$. Finally, when $\mathcal{F}$ outputs a forgery $\sigma^*$ (for the concrete scheme) on
a message $M^*$, $\mathcal{A}$ finds a point $S^* \in E$ whose x-coordinate is $\sigma^*$. By the
discussion above, either $(P, R, h(M^*), S^*)$ is a Diffie-Hellman tuple, in which
case $S^*$ is a signature on $M^*$ in the concrete scheme, or $(P, R, h(M^*), -S^*)$ is
a Diffie-Hellman tuple, in which case $-S^*$ is a signature on $M^*$ in the concrete
scheme. $\mathcal{A}$ outputs $M^*$ along with the appropriate one of $S^*$ and $-S^*$.

The additional time required for this simulation is dominated by the two
additional signature verifications, each of which takes time $\tau$. Thus if the
generic GDH scheme is $(t_2, q_H, q_S, \epsilon_2)$-secure, the concrete GDH scheme is
$(t_3, q_H, q_S, \epsilon_3)$-secure, where

$$t_3 = t_2 - 2\tau \quad\text{and}\quad \epsilon_3 = \epsilon_2. \qquad (***)$$

Combining (*), (**), and (***) yields the required reduction. □

3.5 An Open Problem: Short Signatures with High Security 

In the previous section we proposed using a supersingular curve over $\mathbb{F}_{3^l}$ to
build a short signature scheme as secure as discrete log in $\mathbb{F}_{3^{6l}}^*$. However,
there is no reason to stick with supersingular curves. Using other elliptic or
hyper-elliptic curves it might be possible to achieve even higher security
multipliers.

In Section 3.2, we showed that the curves $E^+$ and $E^-$ over $\mathbb{F}_{3^l}$ have security
parameter $\alpha$ at most 6. This is, in fact, the maximum value of $\alpha$ for any
supersingular curve [15,23]. Instantiating the GDH signature scheme on
(necessarily non-supersingular) elliptic curves with slightly higher values of
$\alpha$ would increase the work required for verification, but also increase
security against MOV-related attacks at comparable signature bit lengths.

Consider an elliptic curve $E/\mathbb{F}_{p^l}$ with $m$ points, a large prime $q \mid m$, a
security parameter $\alpha$ for the subgroup of order $q$, and two linearly
independent points, $P$ and $Q$, of order $q$, where $P \in E/\mathbb{F}_{p^l}$ and
$Q \in E/\mathbb{F}_{p^{l\alpha}}$. Note that a point $Q \in E/\mathbb{F}_{p^{l\alpha}}$ linearly independent
of $P$ must exist by [2] assuming $q \nmid p^l - 1$. For such a curve, there is not
necessarily an automorphism that maps between $\langle P \rangle$ and $\langle Q \rangle$. We
therefore slightly modify the Gap Signature Scheme to use the two groups
together.

It is easy to decide whether a tuple (P, aP, Q, bQ) is such that a = b, using 
the Weil pairing. We call this the co-Decision Diffie-Hellman problem, and it has 
an obvious computational variant: given the tuple (P, Q, aQ), compute aP. The 
modified (co-gap) signature scheme is as follows. 

Key Generation. Let $P \in E/\mathbb{F}_{p^l}$ and $Q \in E/\mathbb{F}_{p^{l\alpha}}$ be two linearly
    independent points of prime order $q$ as described above. Pick
    $x \xleftarrow{R} \mathbb{Z}_q^*$, and compute $R \leftarrow xQ$. The public key is
    $(E/\mathbb{F}_{p^l}, q, Q, R)$. The secret key is $x$.

Signing. Given a secret key $x$, and a message $M \in \{0,1\}^*$ use
    MapToGroup$_{h'}$ to map $M$ to a point $P_M \in \langle P \rangle$. Set $S_M \leftarrow xP_M$.
    The signature $\sigma$ is the x-coordinate of $S_M$, an element of $\mathbb{F}_{p^l}$.

Verification. Given a public key $(E/\mathbb{F}_{p^l}, q, Q, R)$, a message $M$, and a
    purported signature $\sigma$, let $S$ be a point on $E/\mathbb{F}_{p^l}$ of order $q$ whose
    x-coordinate is $\sigma$ and whose y-coordinate is $y$ for some $y \in \mathbb{F}_{p^l}$ (if
    no such point exists reject the signature as invalid). Set $u \leftarrow e(Q, S)$
    and $v \leftarrow e(R, h(M))$. If either $u = v$ or $u^{-1} = v$, accept the signature.
    Otherwise, reject.

By reasoning analogous to that in Section 3.4, the tests in the verification
phase ensure that either $(Q, R, h(M), S)$ or $(Q, R, h(M), -S)$ is a valid
co-Diffie-Hellman tuple. While the public key, $R$, is an element of
$E/\mathbb{F}_{p^{l\alpha}}$, and thus long, a signature $\sigma$ is an element of $E/\mathbb{F}_{p^l}$, and
thus relatively short. The security of this scheme follows from the assumption
that no adversary $(t, \epsilon)$-breaks the co-Computational Diffie-Hellman problem.

The challenge, therefore, is to construct elliptic curves with larger values of
$\alpha$, say $\alpha = 10$. It is currently an open problem to build a family of elliptic
curves with security multiplier $\alpha = 10$.

Galbraith [8] constructs supersingular curves of higher genus with a "large"
security multiplier. For example, the supersingular curve $y^2 + y = x^5 + x^3$ has
security multiplier 12 over $\mathbb{F}_{2^l}$. Since a point on the Jacobian of this curve
of genus two is characterized by two values in $\mathbb{F}_{2^l}$ (the two x-coordinates in
a reduced divisor) the length of the signature is $2l$ bits. Hence, we might
obtain a signature of length $2l$ with security of computing CDH in the finite
field $\mathbb{F}_{2^{12l}}$. This factor of 6 between the length of the signature and the
degree of the finite field is the same as in the elliptic curve case. Hence, this
genus 2 curve does not improve the security of the signature, but does give more
variety in signature lengths beyond those given in Table 1. Since this curve is
defined over a field of characteristic two it is better suited for computation
than curves defined over fields of characteristic three. Galbraith shows that
Jacobians of genus 2 supersingular curves have a maximum security multiplier of
12. Therefore, genus 2 supersingular curves will not give short signatures with
higher security. It is an open problem whether one can build a family of
hyper-elliptic curves of genus 3 that would give short signatures with higher
security.



4 Proof of Security Theorem 

We prove, in the random oracle model, that GDH signatures are secure in GDH
groups. The proof is similar to that given for full-domain hash RSA signatures
by Coron [6], but the presentation is different. The point of this method is that
the break-probability $\epsilon$ for the signature scheme does not depend on the number
of hash queries a forger makes, but only depends on the number of signature
queries made by the adversary.

Theorem (Gap Signature Security). If $G$ is a $(\tau, t', \epsilon')$-GDH group, then the
Gap Signature Scheme on $G$ is $(t, q_H, q_S, \epsilon)$-secure against existential forgery
on adaptive chosen-message attacks, where

$$t \le t' - 2c_A(\lg p)(q_H + q_S) \quad\text{and}\quad \epsilon \ge 2e \cdot q_S \epsilon',$$

and $c_A$ is a small constant (in practice, at most 2).

The proof follows, in stages. 
4.1 Overview

Assume an algorithm $\mathcal{F}$ $(t, q_H, q_S, \epsilon)$-breaks the Gap Signature Scheme on
$G$. We will use $\mathcal{F}$ to construct an algorithm $\mathcal{A}$ that $(t', \epsilon')$-breaks
Computational Diffie-Hellman on $G$, where $t'$ and $\epsilon'$ are as above.

Given a forger $\mathcal{F}$ for the GDH group $G$, we build an algorithm $\mathcal{A}$ that uses
$\mathcal{F}$ to break CDH on $G$. $\mathcal{A}$ is given a challenge $(g, g^a, g^b)$. It uses this
challenge to construct a public key that it provides to $\mathcal{F}$. It then allows
$\mathcal{F}$ to run. At times, $\mathcal{F}$ makes queries to two oracles, one for message hashes
and one for message signatures. These oracles are puppets of $\mathcal{A}$, which it
manipulates in constructive ways. Finally, if all goes well, the forgery which
$\mathcal{F}$ outputs is transformed by $\mathcal{A}$ into an answer to the CDH challenge.

We assume that $\mathcal{F}$ is well-behaved in the sense that it always requests the
hash of a message $M$ before it requests a signature for $M$, and that it always
requests a hash of the message $M^*$ that it outputs as its forgery. It is trivial
to modify any forger algorithm $\mathcal{F}$ to have this property.

$\mathcal{A}$ needs to engage in a certain amount of bookkeeping. In particular, it must
maintain a list of the messages on which $\mathcal{F}$ requests hashes or signatures. Each
message $M$, as it arrives from $\mathcal{F}$, is assigned an index $i$; $i$ is obviously
bounded above by $q_H$. The message is stored in $M_i$, its hash in $h_i$, and its
signature (if available) in $\sigma_i$.



4.2 Construction of A 

Rather than describe $\mathcal{A}$'s behavior and prove its efficacy in toto, we will
construct $\mathcal{A}$ in a series of "games," in which increasingly sophisticated
$\mathcal{A}$-variants run $\mathcal{F}$; the final variant, $\mathcal{A}_6$, is the $\mathcal{A}$ we seek.

(Each of the $\mathcal{A}$-variants will depend on a probability constant $\zeta$, which will
be optimized later, to yield the best possible reduction. Define $B_\zeta$ to be the
probability distribution over $\{0,1\}$ where 1 is drawn with probability $\zeta$, and
0 with probability $1 - \zeta$.)

Game 1. $\mathcal{A}_1$ is given a challenge $(g, g^a, g^b)$. In setup, it constructs
$PK \leftarrow (g^a)$. Then, for each $i$, $1 \le i \le q_H$, $\mathcal{A}_1$ picks a random bit
$s_i \xleftarrow{R} B_\zeta$, and a random number $r_i \xleftarrow{R} \mathbb{Z}_p^*$. It then sets
$h_i \leftarrow g^{r_i}$ and $\sigma_i \leftarrow (g^a)^{r_i}$. Note that $(g, g^a, h_i, \sigma_i)$ is a valid
Diffie-Hellman tuple, so $\sigma_i$ is a signature on any message whose hash is $h_i$.
$\mathcal{A}_1$ then runs $\mathcal{F}$ with public key $PK$.

When $\mathcal{F}$ requests a hash on a message $M_i$, $\mathcal{A}_1$ responds with $h_i$; when
$\mathcal{F}$ requests a signature on a message $M_i$, $\mathcal{A}_1$ responds with $\sigma_i$.

Finally, $\mathcal{F}$ halts, either conceding failure or returning a forged signature
$(M^*, \sigma^*)$, where $M^* = M_{i^*}$ for some $i^*$ (on which $\mathcal{F}$ had not requested a
signature). If $\mathcal{F}$ succeeds in forging, $\mathcal{A}_1$ outputs "success"; otherwise, it
outputs "failure".

The hashes $h_i$ are uniformly distributed in $G$, so $\mathcal{A}_1$'s hash oracle is a
random oracle. Moreover, the signatures $\sigma_i$ are all valid. In the random oracle
model, therefore, $\mathcal{F}$, when run by $\mathcal{A}_1$, behaves exactly as it would when
running on its own. Thus

$$\mathrm{Adv}_{\mathcal{A}_1}
  = \Pr\left[\mathcal{A}_1 \text{ outputs success} : a, b \xleftarrow{R} \mathbb{Z}_p^*\right]
  = \Pr\left[\mathrm{Verify}(PK, M^*, \sigma^*) = \text{valid} :
      (PK, SK) \xleftarrow{R} \mathrm{KeyGen},\ (M^*, \sigma^*) \xleftarrow{R} \mathcal{F}(PK)\right]
  = \epsilon,$$

where the first probability is taken over the coin tosses of $\mathcal{A}_1$ and $\mathcal{F}$,
and over the choices of $a$ and $b$. Since $a$ is chosen uniformly from $\mathbb{Z}_p^*$,
$g^a$, the public key $\mathcal{A}_1$ provides $\mathcal{F}$, is uniformly distributed in $G$.



Game 2. $\mathcal{A}_2$ functions as does $\mathcal{A}_1$, with a single exception. If $\mathcal{F}$
fails, $\mathcal{A}_2$ outputs "failure"; if $\mathcal{F}$ succeeds, outputting a forgery
$(M^*, \sigma^*)$ where $i^*$ is the index of $M^*$, then $\mathcal{A}_2$ outputs "success" if
$s_{i^*} = 1$, but "failure" if $s_{i^*} = 0$.

Clearly, $\mathcal{F}$ can get no information about any $s_i$, so its behavior cannot
depend on their values. Thus the final trip test $\mathcal{A}_2$ performs is independent
of the game to that point. Thus we have

$$\mathrm{Adv}_{\mathcal{A}_2} = \mathrm{Adv}_{\mathcal{A}_1} \cdot \Pr[s_{i^*} = 1] = \zeta\epsilon$$

since each $s_i$ is drawn from $B_\zeta$.

Game 3. $\mathcal{A}_3$ functions as does $\mathcal{A}_2$, but, again, with a modification. If
$\mathcal{F}$ fails to create a forgery, $\mathcal{A}_3$ also fails. If $\mathcal{F}$ succeeds in finding
a forgery on $M_{i^*}$, $\mathcal{A}_3$ claims success only if $s_{i^*} = 1$, and $\mathcal{F}$ asked for
signatures only on messages $M_i$ for which $s_i = 0$.

Again, $\mathcal{F}$ can get no information about any $s_i$. Each of its signature
requests can cause $\mathcal{A}_3$ to declare failure at the game's end, with probability
$\zeta$, but it cannot know, during the game, whether any of them did. The $s_i$'s
are independent, so each of $\mathcal{F}$'s signature requests is an independent trial
insofar as disqualification by $s_i$ is concerned. Moreover, $s_{i^*}$ is independent
of any $s_i$'s for which $\mathcal{F}$ requests signatures, so the test that $s_{i^*}$ equals
1 is again an independent trial, and the analysis of Game 2 is not affected.

The probability of $\mathcal{F}$'s not being disqualified because of any particular
signature request is $1 - \zeta$. If $\mathcal{F}$ makes $k$ signature oracle queries, where
$k$ necessarily is at most $q_S$, and if, moreover, it makes those queries on the
messages with indices $i_1, \ldots, i_k$, then

$$\mathrm{Adv}_{\mathcal{A}_3} = \mathrm{Adv}_{\mathcal{A}_2} \cdot \Pr[s_{i_j} = 0,\ j = 1, \ldots, k]
  = \zeta\epsilon \cdot (1 - \zeta)^k \ge (1 - \zeta)^{q_S} \zeta\epsilon .$$

Game 4. $\mathcal{A}_4$ functions as does $\mathcal{A}_3$, except that, if $\mathcal{F}$ requests a
signature on a message $M_i$ for which $s_i = 1$, $\mathcal{A}_4$ declares failure and halts
immediately.

We may fully describe a run of $\mathcal{A}$ by fixing the challenge, $\mathcal{A}$'s random
bits, and $\mathcal{F}$'s random bits; these collectively determine the value of each
$s_i$, and the indices on which $\mathcal{F}$ requests signatures. Let us call unlucky any
runs in which $\mathcal{F}$ requests a signature on some $M_i$ for which $s_i = 1$. $\mathcal{A}_3$
would already declare failure on any unlucky runs: if $\mathcal{F}$ declares failure,
$\mathcal{A}_3$ does also; if $\mathcal{F}$ finds a forgery, $\mathcal{A}_3$ fails anyway because of the
unlucky signature query. Thus $\mathcal{A}_3$ and $\mathcal{A}_4$ will agree (with output
"failure") on all unlucky runs; they will also agree on all lucky runs, since
the modification of $\mathcal{A}_4$ relative to $\mathcal{A}_3$ is not invoked on those runs. Thus
we have

$$\mathrm{Adv}_{\mathcal{A}_4} = \mathrm{Adv}_{\mathcal{A}_3} \ge (1 - \zeta)^{q_S} \zeta\epsilon .$$

The immediate halt in unlucky runs is a shortcut and does not affect the outcome
distribution.

Game 5. $\mathcal{A}_5$ is based on $\mathcal{A}_4$. In the setup phase, for each $i$, if $s_i = 1$,
$\mathcal{A}_5$ sets $h_i \leftarrow g^b \cdot g^{r_i}$ and $\sigma_i \leftarrow \star$, a placeholder value; if
$s_i = 0$, it sets $h_i \leftarrow g^{r_i}$ and $\sigma_i \leftarrow (g^a)^{r_i}$ as before.

$G$ is a cyclic group of prime order, so multiplication by any element of $G$, and
$g^b$ in particular, induces a permutation on $G$. Thus if $r$ is uniformly
distributed in $\mathbb{Z}_p^*$, $g^r$ and $g^b \cdot g^r$ have identical, uniform distributions
in $G$. $\mathcal{F}$ cannot learn any information about the $s_i$'s from examining the
$h_i$'s it is given. $\mathcal{A}_5$ is unable to provide signatures on messages for which
$s_i = 1$, but that is unimportant, since any runs in which $\mathcal{F}$ asks for such a
signature are failed immediately. Therefore, $\mathcal{F}$ will behave under $\mathcal{A}_5$
exactly as it does under $\mathcal{A}_4$, and

$$\mathrm{Adv}_{\mathcal{A}_5} = \mathrm{Adv}_{\mathcal{A}_4} \ge (1 - \zeta)^{q_S} \zeta\epsilon .$$

Game 6. $\mathcal{A}_6$ behaves as does $\mathcal{A}_5$. In those games where $\mathcal{A}_5$ outputs
"success", however, $\mathcal{A}_6$ outputs "success" and, in addition, outputs
$\sigma^*/(g^a)^{r_{i^*}}$ where $i^*$ is the index of the message $M^*$ for which $\mathcal{F}$
output a forged signature $\sigma^*$. ($\mathcal{A}_6$, like the $\mathcal{A}$'s before it, only
succeeds when $\mathcal{F}$ succeeds.)

Clearly, $\mathcal{A}_6$ succeeds with precisely the same probability as $\mathcal{A}_5$, so

$$\mathrm{Adv}_{\mathcal{A}_6} = \mathrm{Adv}_{\mathcal{A}_5} \ge (1 - \zeta)^{q_S} \zeta\epsilon .$$

Moreover, $\mathcal{A}_6$ only succeeds if $s_{i^*} = 1$, which means that
$h_{i^*} = g^b \cdot g^{r_{i^*}}$. If $\sigma^*$ is a valid signature on $M^* = M_{i^*}$, then
$(g, g^a, h_{i^*}, \sigma^*)$ must be a valid Diffie-Hellman tuple, so $\sigma^*$ must equal
$h_{i^*}^a = g^{ab} \cdot (g^a)^{r_{i^*}}$. Thus, in every instance on which $\mathcal{A}_6$ claims to
succeed, it also outputs $\sigma^*/(g^a)^{r_{i^*}} = g^{ab}$, which is indeed the answer
to the Diffie-Hellman challenge posed to it.
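
The extraction step of Game 6 is a one-line computation once the forgery is in hand. The sketch below replays it in a toy group that is an assumption made purely for illustration: the order-11 subgroup of $\mathbb{Z}_{23}^*$ generated by 2, written multiplicatively, in place of the paper's elliptic-curve group. The exponent `a` is used only to simulate what a successful forger would return.

```python
# Toy group (assumption): the subgroup of Z_23^* of order 11 generated by g = 2.
P_MOD, ORDER, G = 23, 11, 2

def extract_cdh(a, b, r):
    """Simulate Game 6 for one patched hash value and return A_6's output."""
    ga, gb = pow(G, a, P_MOD), pow(G, b, P_MOD)   # the CDH challenge (g, g^a, g^b)
    h = gb * pow(G, r, P_MOD) % P_MOD             # h_{i*} = g^b * g^r  (s_{i*} = 1)
    sigma = pow(h, a, P_MOD)                      # a valid forgery equals h^a
    # A_6 knows only g^a and r, and divides out (g^a)^r:
    return sigma * pow(pow(ga, r, P_MOD), -1, P_MOD) % P_MOD
```

For every choice of the blinding exponent `r` the output equals $g^{ab}$, which is exactly why the forger's answer on a patched hash solves the challenge.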

4.3 Optimization and Conclusion 

The algorithm $\mathcal{A}_6$ thus uses the GDH-signature forger $\mathcal{F}$ to solve CDH
challenges. What remains is to optimize the parameter $\zeta$ to achieve a maximal
probability of success. The function $(1 - \zeta)^{q_S} \zeta\epsilon$ is maximized at
$\zeta = 1/(q_S + 1)$, where it has the value

$$\frac{1}{q_S + 1} \left(1 - \frac{1}{q_S + 1}\right)^{q_S} \epsilon
  = \frac{1}{q_S} \left(1 - \frac{1}{q_S + 1}\right)^{q_S + 1} \epsilon .$$

(The latter equality follows from taking partial fractions.) Now $\mathcal{A}$'s success
probability $\epsilon'$ is at least as great as this. For large $q_S$,
$(1 - 1/(q_S + 1))^{q_S + 1} \approx 1/e$.
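
The choice of $\zeta$ and the resulting bound are easy to check numerically. The sketch below evaluates the factor $(1 - \zeta)^{q_S}\zeta$ that multiplies $\epsilon$; the function names are illustrative assumptions, and floating-point arithmetic is used, so it is a sanity check rather than a proof.

```python
from math import e, isclose

def success_factor(zeta, q_s):
    """(1 - zeta)^q_s * zeta, the factor multiplying epsilon in Games 3 to 6."""
    return (1 - zeta) ** q_s * zeta

def best_zeta(q_s):
    """The maximizing value zeta = 1 / (q_s + 1)."""
    return 1 / (q_s + 1)

def closed_form(q_s):
    """(1 / q_s) * (1 - 1/(q_s + 1))^(q_s + 1), the rewritten maximum."""
    return (1 - 1 / (q_s + 1)) ** (q_s + 1) / q_s
```

With $q_S = 1$ the maximum is $1/4$, already above $1/(2e) \approx 0.184$, and as $q_S$ grows the factor approaches $1/(e\,q_S)$ from above.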

$\mathcal{A}$'s running time includes the running time of $\mathcal{F}$. The additional
overhead imposed by $\mathcal{A}$ is dominated by the need to evaluate group
exponentiation for each signature and hash request from $\mathcal{F}$. Any one such
exponentiation may be computed by using at most $2 \lg p$ group actions, and thus
at most $2 \lg p$ time units, on $G$ (see [16]). $\mathcal{A}$ may need to answer as many as
$q_H + q_S$ such requests, so its overall running time is
$t' \le t + 2c_A(\lg p)(q_H + q_S)$, where $c_A$ is a small constant that accounts for
the remainder of $\mathcal{A}$'s administrative overhead; in practice, $c_A$ should be at
most 2.

To summarize: if there exists a forger algorithm $\mathcal{F}$ that
$(t, q_H, q_S, \epsilon)$-breaks the GDH signature scheme on $G$, then there exists an
algorithm $\mathcal{A}$ that $(t', \epsilon')$-breaks CDH on $G$, where

$$t' = t + 2c_A(\lg p)(q_H + q_S) \quad\text{and}\quad
  \epsilon' = \frac{1}{q_S} \cdot \left(1 - \frac{1}{q_S + 1}\right)^{q_S + 1} \cdot \epsilon .$$

Conversely, if $G$ is a $(\tau, t', \epsilon')$-GDH group, then there can exist no
algorithm $\mathcal{F}$ that $(t, q_H, q_S, \epsilon)$-breaks the GDH signature scheme, where

$$t = t' - 2c_A(\lg p)(q_H + q_S) \quad\text{and}\quad
  \epsilon = q_S \cdot \left(1 - \frac{1}{q_S + 1}\right)^{-(q_S + 1)} \cdot \epsilon' .$$

For all positive $q_S$, the quantity $(1 - 1/(q_S + 1))^{q_S + 1}$ in the latter
equation is greater than $1/2e$, so the equation may be rewritten as
$\epsilon \le 2e \cdot q_S \epsilon'$. This completes the proof.



5 Experimental Results 

5.1 Implementation Details 

We experimented with the scheme of Section 3.4. Recall that signing is a single
multiplication on the curve $y^2 = x^3 + 2x \pm 1$ over $\mathbb{F}_{3^l}$. Verifying a
signature requires two Weil pairing computations over $\mathbb{F}_{3^{6l}}$. Hence,
verifying takes more time than signing.

For efficiency, rather than working in $\mathbb{F}_{3^{6l}}$ directly (which involves
manipulating polynomials of degree $6l$), we work with extensions of $\mathbb{F}_{3^6}$ of
degree $l$. To speed up arithmetic in $\mathbb{F}_{3^6}$ we construct lookup tables (of
size $3^6$) for quickly multiplying two elements. Elements of $\mathbb{F}_{3^6}$ are
represented by their exponent relative to a chosen generator, so that
multiplication and division correspond to addition and subtraction modulo
$3^6 - 1$. Addition is done using a multiplication, division and table lookup via
the identity $a + b = a(1 + a^{-1}b)$. The constants $r^+, r^-, u$ used in the
automorphism $\phi$ also lie in $\mathbb{F}_{3^6}$ and can be quickly found by a brute force
search.
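
The exponent representation and the identity $a + b = a(1 + a^{-1}b)$ can be illustrated on a much smaller field. The sketch below uses the prime field $\mathbb{F}_{31}$ as a stand-in for $\mathbb{F}_{3^6}$, which is an assumption made to keep the example short; the table and function names are likewise hypothetical. Nonzero elements are stored as exponents of a generator, and the zero element, which has no exponent, is represented by `None`.

```python
P = 31  # toy prime field standing in for F_{3^6} (assumption)

def find_generator(p):
    """Brute-force search for a generator of the multiplicative group."""
    for g in range(2, p):
        if len({pow(g, k, p) for k in range(p - 1)}) == p - 1:
            return g
    raise ValueError("no generator found")

GEN = find_generator(P)
EXP = [pow(GEN, k, P) for k in range(P - 1)]     # exponent -> element
LOG = {v: k for k, v in enumerate(EXP)}          # element  -> exponent
# ZECH[n] = log(1 + g^n), or None when 1 + g^n is zero
ZECH = [LOG.get((1 + EXP[n]) % P) for n in range(P - 1)]

def mul(i, j):
    """g^i * g^j: add exponents modulo P - 1."""
    return (i + j) % (P - 1)

def div(i, j):
    """g^i / g^j: subtract exponents modulo P - 1."""
    return (i - j) % (P - 1)

def add(i, j):
    """g^i + g^j via a + b = a(1 + a^{-1} b); returns None for a zero sum."""
    z = ZECH[div(j, i)]                          # table lookup for 1 + a^{-1} b
    return None if z is None else mul(i, z)
```

Each addition costs one division, one table lookup and one multiplication on exponents, which is the trade-off described above.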

We map an element $a$ in $\mathbb{F}_{3^l}$ to an element of $\mathbb{F}_{3^{6l}}$ using the obvious
injection: $a$ is represented by a polynomial of degree $l$ with coefficients in
$\mathbb{F}_3$, and we simply view it as a polynomial with coefficients in $\mathbb{F}_{3^6}$.
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We use the Tate pairing [7] instead of the Weil pairing, since it has similar 
properties and is easier to compute: the Weil pairing requires two iterations of 
Miller’s algorithm [17] and one division while the Tate pairing needs only one 
call to Miller’s algorithm and an additional exponentiation. 

Because Miller’s algorithm involves the computation of various quotients, sev- 
eral divisions can be avoided since we may scale the numerator and denominator 
by arbitrary constants. We used sliding windows for every exponentiation-like 
operation, that is, exponentiation in F 36 i, Miller’s algorithm, and multiplication 
of a point on the curve. Point multiplication can be sped up further by using 
signed sliding windows, converting to weighted projective coordinates (though 
this may not help; it depends on the implementation of the field operations), and 
taking advantage of the fact that some points are fixed for the whole system. 

Recall that the output of the Tate pairing is a coset representative in $F_{3^{6l}}^*$.
Signature verification then consists of checking that the outputs of two Tate
pairings lie in the same coset. This could be done by finding the quotient of
the outputs, raising it to the appropriate power, and comparing the result with
the identity element. However, we can replace the division with a multiplication by
exploiting the bilinearity of the Tate pairing: dividing by $e(A,B)$ is equivalent
to multiplying by $e(A,-B) = 1/e(A,B)$ ($-B$ can be easily computed from $B$
by negating the $y$-coordinate).
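
The inversion identity follows directly from bilinearity. Writing $\mathcal{O}$ for the point at infinity (a symbol introduced here only for this derivation):

```latex
e(A,B)\cdot e(A,-B) \;=\; e\bigl(A,\, B + (-B)\bigr) \;=\; e(A,\mathcal{O}) \;=\; 1
\quad\Longrightarrow\quad
e(A,-B) \;=\; e(A,B)^{-1}.
```

So the quotient of two pairing outputs $e(A_1,B_1)/e(A_2,B_2)$ can be computed as the product $e(A_1,B_1)\cdot e(A_2,-B_2)$, with no field inversion.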

The $x$-coordinate is an element of $F_{3^l}$ and is represented as a polynomial of
degree at most $l - 1$ with coefficients in $F_3$. For output, it is viewed as a number in
base 3, and then encoded in base-64. For $l = 97$, which has 923-bit discrete-log
security, an example signature looks as follows: “KrpIcV0D9CJ8iyBS8MyVkNrMyE”.
This is under half the size of the standard 320-bit DSS signature (with 1024-bit
discrete-log security).
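
A minimal sketch of such an output encoding is given here. The text does not specify the base-64 alphabet, the digit order, or any padding, so the choices made in this block (the common `A-Za-z0-9+/` alphabet, most significant digit first, fixed width) are assumptions, and the function names are hypothetical.

```python
import string

# Assumed alphabet; the paper does not say which one was used.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

def width_for(l):
    """Smallest number of base-64 digits that can hold any value below 3**l."""
    w = 0
    while 64 ** w < 3 ** l:
        w += 1
    return w

def encode(coeffs):
    """coeffs[i] in {0, 1, 2} is the coefficient of x^i; returns a base-64 string."""
    n = 0
    for c in reversed(coeffs):      # read the polynomial as a number in base 3
        n = 3 * n + c
    digits = []
    for _ in range(width_for(len(coeffs))):
        n, d = divmod(n, 64)
        digits.append(ALPHABET[d])
    return "".join(reversed(digits))

def decode(text, l):
    """Inverse of encode for a polynomial with l coefficients."""
    n = 0
    for ch in text:
        n = 64 * n + ALPHABET.index(ch)
    coeffs = []
    for _ in range(l):
        n, c = divmod(n, 3)
        coeffs.append(c)
    return coeffs
```

For $l = 97$ a value below $3^{97}$ fits in 154 bits, and `width_for(97)` gives 26 base-64 characters, which matches the length of the example signature above.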

5.2 Running Times 

The following table shows the time required to verify a signature. Recall that a 
verification is much more expensive than signature generation because it requires 
computing two pairings. The program was run on a 1 GHz Pentium III computer
running GNU/Linux. 

When using elliptic curves to get short GDH signatures we are forced to
use a curve over a field of characteristic three. This slows down arithmetic on
the curve. It is possible that the running times above can be improved using
higher genus curves over fields of characteristic two as discussed at the end of
Section 3.5. Similarly, the techniques of [13] for computing on the curves $E^+$
and $E^-$ over $F_{3^l}$ may slightly improve these numbers.

6 Conclusions 

We presented a short signature based on the Weil pairing. The length of a signa- 
ture is one element of a finite field. Standard signatures based on discrete log such 
as DSA require two elements. When working with the curve $y^2 = x^3 + 2x \pm 1$ over
$F_{3^l}$ the MOV attack maps the CDH problem in this curve to a CDH problem in
$F_{3^{6l}}$. Hence, we can use small values of $l$ to obtain short signatures with security
comparable to the security of 320-bit DSA. For example, we obtain a signature of
length 154 bits where breaking the scheme reduces to solving the Diffie-Hellman
problem in a finite field of size approximately $2^{923}$. In Section 3.5 we outlined an
open problem that would enable us to get even better security while maintaining 
the same length signatures. We hope future work on constructing elliptic curves 
or higher genus curves will help in solving this problem. 

Acknowledgments. The authors thank Steven Galbraith, Alice Silverberg, 

Abstract. We describe two simple, efficient and effective credential
pseudonymous certificate systems, which also support anonymity without
the need for a trusted third party. The second system provides cryptographic
protection against the forgery and transfer of credentials. Both
systems are based on a new paradigm, called self-blindable certificates. 
Such certificates can be constructed using the Weil pairing in supersin- 
gular elliptic curves. 



1 Introduction 

Credential pseudonymous certificates (CPCs) were introduced by David Chaum 
[7] in 1985 to counter some of the privacy problems related to identity certificates. 
One such problem is that service providers know exactly who they are servicing 
when a user employs an identity certificate, which for some applications is not 
required, acceptable or even permissible. Moreover, by combing their logs, service 
providers can piece together a record of all the user’s activities. 

A pseudonym is a unique identifier (string) by which a user is known to a certain
party; typically each party knows the same user by a different pseudonym.
These pseudonyms can be references to a user’s identity known only by designated
parties, or can be completely anonymous (i.e., known only to the user).
Unlike Chaum [7], we do not limit a ‘physical’ user to only one pseudonym with 
a given provider. We believe that for some types of providers, e.g., on-line, sub- 
scription based, information providers, the use of many different pseudonyms for 
one physical user, without the provider knowing, can be considered an important 
feature. However, we do discuss how, if necessary, such unique pseudonyms can 
be supported by our systems. 

A pseudonymous certificate binds a user’s pseudonym to their public key, the 
private key to which the user possesses. Such certificates are issued by a trust 
provider. Identities, pseudonyms and public keys should be unique. A credential 
is a trust provider’s statement about the user which is relied upon by other 
parties, who we simply call service providers. Examples of such statements are 
properties such as “lives in Amsterdam”, qualifications such as “has a PhD in 
math”, or rights such as “can access this secure room”. A credential can be 
single-use, such as a prescription, or multiple-use such as a driver’s license. In 
this paper we focus on the latter type of credentials. 

C. Boyd (Ed.): ASIACRYPT 2001, LNCS 2248, pp. 533-551, 2001. 

Finally, credential pseudonymous certificates (CPCs) are digital certificates
that bind credentials to users, known by a pseudonym. Proof of credential pos- 
session is given by proving possession of the private key related to the public 
key referenced in the certificate. Several credentials may be bound to a single 
pseudonymous certificate and, thus, pseudonym. 

In Chaum’s model, pseudonyms are unlinkable: parties that know a user 
by different pseudonyms must not have the ability to combine their logs to 
assemble a dossier on the user.^1 Another requirement in Chaum’s model is that
CPCs must be translatable: a CPC issued under pseudonym A must be usable
under pseudonym B. For example, a user may be given a credential asserting its
good health from a doctor under pseudonym A, and show this to its insurance
company, who knows it by pseudonym B. In addition to these two requirements,
the system should fulfill the following three basic security requirements: 

Protection against pseudonym/credential forgery. It should not be pos- 
sible for outsiders, malicious users, or other parties involved to generate 
(credential) pseudonymous certificates without the consent of the relevant 
trust providers. 

Protection against pseudonym/credential sharing. A user could be 
tempted to share its credentials (e.g., a season pass for public transport) 
with another user. It should therefore be very difficult or awkward for a 
user to do so.^2 One potential solution to this problem would be to store
credentials on tamper-resistant devices that are valuable to the user (e.g.,
smartcard-based passports). A better solution would be an all-or-nothing
concept for credentials: sharing a credential effectively implies sharing a
credential that is highly valuable to the user, most notably one enabling the
recipient to take over the user’s identity and digitally sign contracts that
legally bind the user (cf. [6], [5]).

Revocation of pseudonymous certificates and credentials. Under cer- 
tain circumstances, it should be possible for the user and trust providers 
to revoke pseudonymous certificates as well as credentials bound to them. 
This could be the case, for instance, if a user loses secret (key) information or
changes jobs.

CPCs such as those described above counter the privacy problems of identity
certificates to some extent, but not completely. Indeed, in that setting, all
of a user’s activities with a provider are related to a pseudonym, so that the
provider can link the user’s activities with the fixed pseudonym. If the user’s
identity is compromised, then so are its activities. To prevent this potential
problem, a CPC system should preferably allow users to change pseudonyms
easily and regularly. A CPC system should also ensure that the translation of
credentials involves as few (trusted) parties as possible. In our CPC system,
users themselves can both change pseudonyms and translate their credentials.

^1 Unlinkability and pseudonymity of credentials are sometimes difficult to enforce
simultaneously in practice. Indeed, even if they are anonymous, credentials implicitly
narrow down the number of possible users possessing them. To illustrate, how many
people have both a degree in cryptography (credential number one) and Swedish
citizenship (credential number two)?

^2 Perhaps complete eradication of credential sharing would be impossible in the
virtual world, as the end user might give away everything he knows (passwords) or has
(smartcards), leaving only the identification factor “what the user is” (e.g.,
biometrics) to counter credential sharing.

We remark that, in the above text, we implicitly define the parties as users, 
trust providers (providing credentials and pseudonyms to users) and service 
providers (relying on credentials and pseudonyms), which we use in the remain- 
der of this paper without further explanation. 

The goal of this paper is to describe a very simple, effective and efficient CPC 
system that meets the basic requirements of a CPC system and that is based on 
the new paradigm of self-blindable certificates. With this type of certificate the
user can, e.g.:

— generate its own new pseudonymous certificates itself (to which it possesses 
the private key) based on a valid pseudonymous certificate; and 

— translate and combine CPCs issued under one pseudonym to another pseudo- 
nym, including a one-time-use pseudonym. 



1.1 Related Work 

As we could probably write an entire paper just discussing and comparing all of 
the CPC schemes that have been published, we will be brief. The first scheme 
was introduced by Chaum and Evertse [10] and is based on having a semi-trusted 
third party involved in all credential translations. Both from an efficiency and a 
security point of view, this is undesirable. Chen’s scheme [12] envisions a trusted
party who, amongst other things, should be trusted to refrain from transferring
credentials between different users. Damgård’s scheme [13] is based on general
complexity-theoretic primitives and is therefore not suitable for practical use.
The scheme developed by Lysyanskaya, Rivest, Sahai and Wolf [19] is based on 
one-way functions and general zero-knowledge proofs which also makes it inap- 
propriate for practical use. Our CPC system can be considered as the opposite 
of the credential scheme [6] constructed by Camenisch and Lysyanskaya, which 
in effect issues one secret CPC for each trust provider; the scheme’s properties 
of anonymity and untraceability arise from the zero-knowledge protocols that
confirm that a user indeed has such a certificate without revealing it. Although 
the scheme [6] appears to be of practical use, it is based on rather complex 
(zero-knowledge) protocols. Our scheme and the required proofs of knowledge 
are basic (Schnorr and Okamoto). Finally, we mention the work of Brands [5], 
which deals with the related subject of privacy protecting attribute certificates. 
In our system, the user itself can translate or combine credentials received from 
different trust providers without the interaction of any trusted party, generating 
a new certificate. This is an important distinction from Brands’ scheme [5] when 
applied to the special case of a credential certificate system. As a final note, we 
remark that the privacy of our scheme can be further improved by the use of 
“Wallet with Observer” techniques, cf., [5], [11]. 

Outline of the Paper

— In Section 2, we describe a variant of the Chaum-Pedersen digital signature 
scheme which is of crucial importance for our constructions of self-blindable 
certificates. 

— In Section 3, we provide a functional description of our model for CPCs.

— In Section 4.1, we present the first technical construction of our model, which 
assumes that secret key information is stored on tamper-proof devices to 
provide resistance to credential transfer. 

— In Section 4.2, we present the second technical construction of our model, 
which is more resistant to the transfer of credentials, without requiring the 
use of tamper-proof devices. The transfer of any credential in this construc- 
tion to another person will actually result in the transfer of a very valuable 
signing key, e.g., one enabling the holder to sign legally binding contracts in 
the user’s name. 

— In Section 5, we summarize our results. 



2 A Proofless Variant of the Chaum-Pedersen Signature 
Scheme 

A digital signature s formed by an entity is a data string, based on a private key
under control of the entity, that is associated with a message m (in digital form) to
enable a proof that it originates from the entity and that it has not been changed. If the
actual message comprises a public key plus some optional additional attributes, 
then (m, s) is called a certificate and the entity issuing it is called a Certification 
Authority (CA). 

In this section, we describe a digital signature scheme that enables a CA 
to issue certificates that are “self-blindable”. This will be explained further in 
Section 3. The digital signature scheme is based on the Chaum-Pedersen signa- 
ture scheme (cf., [11]). The setting of our scheme is not standard but is based 
on a group, G, of prime order q, with generator g, in which the Decision Diffie- 
Hellman problem is simple, while the discrete logarithm and the Diffie-Hellman 
problems are practically intractable. In the section below, we further explain 
these notions and indicate how such groups can be constructed. In Section 2.2, 
we describe our digital signature scheme and its properties. 



2.1 Groups in which the DDH Problem Is Simple and DH, DL Are 
Hard 

Recall that the Diffie-Hellman (DH) problem with respect to a generator $g$
of a group $G$ of (prime) order $q$ is the problem of computing the values of the
function $DH_g(g^x, g^y) = g^{xy}$. Two other problems are related to the DH problem.
The first one is the Decision Diffie-Hellman (DDH) problem with respect to $g$:
given $a, b, c \in G$ decide whether $c = DH_g(a, b)$ or not. An alternative formulation
of the Decision Diffie-Hellman problem is: given a quadruple $g, g^x, h, h^y$ in the
group $G$ decide whether $x = y$. The second problem related to the DH problem
is the discrete logarithm (DL) problem in $G$ with respect to $g$: given $a = g^x \in G$,
with $0 < x < q$, find $x = DL(a)$. The DL problem is at least as difficult as the DH
problem. It is widely assumed that if the DL problem in $G$ is hard, then so is the DH
problem. Recently (cf. [16], [27], [17]), a large class of groups has been discovered
in which the DDH problem is simple, while the Diffie-Hellman and discrete
logarithm problems are presumably not. This class consists of certain groups of
points on supersingular elliptic curves, in which setting the DDH problem can
be solved efficiently (i.e., in time and space polynomial in the length of the
input) by using the so-called Weil pairing.
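
To make the three problems concrete, here is a small sketch in a toy group where all of them are trivially solvable by brute force. The parameters (the order-11 subgroup of $Z_{23}^*$ generated by 2) and the function names are assumptions chosen for illustration only; they say nothing about the hardness of the problems in cryptographic groups.

```python
P_MOD, Q, G = 23, 11, 2   # toy parameters: G has prime order Q modulo P_MOD

def dl(a, base=G):
    """Discrete logarithm by exhaustive search (only feasible in a toy group)."""
    for x in range(Q):
        if pow(base, x, P_MOD) == a:
            return x
    raise ValueError("element is not a power of the base")

def dh(a, b):
    """DH_g(g^x, g^y) = g^(xy), computed here by first solving a DL."""
    return pow(a, dl(b), P_MOD)

def ddh(a, b, c):
    """Decide whether c = DH_g(a, b)."""
    return c == dh(a, b)

def ddh_quadruple(g, gx, h, hy):
    """Alternative formulation: given g, g^x, h, h^y, decide whether x = y."""
    return dl(gx, g) == dl(hy, h)
```

For example, with `a = pow(G, 3, P_MOD)` and `b = pow(G, 5, P_MOD)`, `ddh(a, b, pow(G, 15, P_MOD))` holds.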

As an illustration of such groups and techniques, consider the curve
$C_a: y^2 = x^3 + a$ with $p \equiv 2 \bmod 3$ and $a$ any non-zero element in GF($p$). Then,
the Frobenius trace over GF($p$) is equal to 0 (hence the curve is supersingular)
and the number of points on the curve in GF($p$) is equal to $p + 1$. Moreover,
as $p \equiv 2 \bmod 3$, the equation $x^3 = 1$ has solutions other than $x = 1$ only in
GF($p^2$); let $\omega$ be such a solution. Now, let $\langle P \rangle$ be a group of points of (prime)
order $q$ on the curve in GF($p$) (i.e., $q$ divides $p + 1$) and let $A, B, C$ be an instance
of the DDH problem with respect to $P$. Then $C = DH_P(A, B)$ if and only if
$e_q(A, D(B)) = e_q(P, D(C))$, where $D(\cdot)$ is the endomorphism (called a distortion
map in [27]) on $C_a$ that maps a point $(x, y)$ on the curve to the point $(\omega \cdot x, y)$
also on the curve (over GF($p^2$)) and where $e_q(\cdot, \cdot)$ is the so-called Weil pairing.
See [1], [20] or [26]. As the Weil pairing is efficiently computable, the DDH
problem is also efficiently solvable in this situation. It is well-known that the
DL problem in the group of points on the curve in GF($p$) reduces to the DL
problem in a subgroup of order $q$ in GF($p^2$)$^*$ (cf. [21]). That is, to make the
DH and DL problems practically intractable against attacks known today, the
length of the prime number $q$ should be at least 160 bits and the length of the
prime number $p$ should be at least 512 bits.
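
Two of the claims above are easy to check numerically for small primes. The following sketch counts points by brute force and lists the cube roots of unity in GF($p$); the function names are hypothetical and the approach is only practical for toy sizes.

```python
def count_points(p, a):
    """Number of points on y^2 = x^3 + a over GF(p), including the point at infinity."""
    square_roots = {}
    for y in range(p):
        square_roots[y * y % p] = square_roots.get(y * y % p, 0) + 1
    total = 1  # the point at infinity
    for x in range(p):
        total += square_roots.get((x ** 3 + a) % p, 0)
    return total

def cube_roots_of_unity(p):
    """All x in GF(p) with x^3 = 1."""
    return [x for x in range(1, p) if pow(x, 3, p) == 1]
```

For primes $p \equiv 2 \bmod 3$ the map $x \mapsto x^3$ is a bijection on GF($p$), which is why the count comes out as $p + 1$ for every non-zero $a$ and why 1 is the only cube root of unity in GF($p$).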

A practical construction of a group in which the DDH problem is efficiently
computable and the DH and DL problems are presumably not, is as follows.
Choose a 512-bit prime number $p$ of type $p = 6q - 1$ where $q$ is also a prime
number and consider the curve $C_1: y^2 = x^3 + 1$. Let $P$ be any GF($p$)-rational
point on the curve of order $q$. This construction is used in [2] in the setting of
an identity-based encryption scheme that is also based on the Weil pairing. That
paper also analyzes the work needed to solve the DDH problem in the group
$\langle P \rangle$, which amounts to a small number of multiplications on the curve.
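
The parameter choice can be sketched with a naive search. This is an illustration at toy sizes only: it uses trial division, whereas a 512-bit search would need a probabilistic primality test, and the function names are hypothetical.

```python
def is_prime(n):
    """Trial division; adequate only for small illustrative values."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def find_parameters(q_min):
    """Return the first prime q >= q_min such that p = 6q - 1 is also prime."""
    q = q_min
    while not (is_prime(q) and is_prime(6 * q - 1)):
        q += 1
    return q, 6 * q - 1
```

Any $p$ of this form satisfies $p \equiv 2 \bmod 3$, so the curve has $p + 1 = 6q$ points and contains a subgroup of order $q$.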

These techniques generalize to groups of points on supersingular elliptic 
curves over a finite field, say F, and the work required to compute the DDH 
problem is asymptotically bounded by 0(fc^ log(||P||) bit operations, i.e., the 
complexity of calculating a Weil pairing. The parameter k is the so-called MOV 
degree (cf. [21]) and is equal to either 1, 2, 3, 4 or 6 in the setting of supersingular 
curves. 

We end this section with two remarks for later reference. A group of points, $G$,
on a supersingular elliptic curve has the property that there exists an efficiently
computable embedding, i.e., an injective homomorphism, of the group in a second
group $G'$ where all three of the DDH, DH and the DL problems are believed to
be hard. Indeed, this embedding is given by the MOV embedding (cf. [21]) and 
the second group, G', is a subgroup of the multiplicative group of a finite field. 
It is shown in [27] that inverting such embeddings is hard; in fact, as hard as 
the DH problem in the group G. Note that by using a specific choice of G, the 
group G' could be the XTR group. Compare [18] and [27]. A group of points on a 
(supersingular) elliptic curve over a finite field used in cryptography is typically 
chosen in such a way that its order is a prime number times a small number (e.g., 
6 in the example above). This means that choosing provably random elements in
the subgroup without knowledge of relative discrete logarithms is very simple, 
e.g., by mapping a hash value into a point on the curve and then mapping it to 
a point in the subgroup. See also [3]. 
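
One way to realize this hash-then-project step on the curve $y^2 = x^3 + 1$ is sketched here, at toy size ($p = 29$, $q = 5$, cofactor 6). The use of SHA-256, the retry counter, and all function names are assumptions for illustration; the method of [3] may differ in its details.

```python
import hashlib

P_MOD = 29     # toy prime of the form 6*Q - 1 (a real system would use about 512 bits)
Q = 5          # prime order of the subgroup
COFACTOR = 6   # (P_MOD + 1) // Q

def ec_add(pt1, pt2):
    """Add points on y^2 = x^3 + 1 over GF(P_MOD); None is the point at infinity."""
    if pt1 is None:
        return pt2
    if pt2 is None:
        return pt1
    x1, y1 = pt1
    x2, y2 = pt2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None
    if pt1 == pt2:
        lam = 3 * x1 * x1 * pow(2 * y1, P_MOD - 2, P_MOD) % P_MOD
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P_MOD, P_MOD - 2, P_MOD) % P_MOD
    x3 = (lam * lam - x1 - x2) % P_MOD
    y3 = (lam * (x1 - x3) - y1) % P_MOD
    return (x3, y3)

def ec_mul(k, pt):
    """Double-and-add scalar multiplication."""
    result = None
    while k:
        if k & 1:
            result = ec_add(result, pt)
        pt = ec_add(pt, pt)
        k >>= 1
    return result

def hash_to_subgroup(data):
    """Hash bytes to a curve point, then clear the cofactor to land in the subgroup."""
    counter = 0
    while True:
        digest = hashlib.sha256(data + counter.to_bytes(4, "big")).digest()
        y = int.from_bytes(digest, "big") % P_MOD
        # p = 2 mod 3, so every element has a unique cube root c^((2p-1)/3).
        x = pow((y * y - 1) % P_MOD, (2 * P_MOD - 1) // 3, P_MOD)
        pt = ec_mul(COFACTOR, (x, y))
        if pt is not None:
            return pt
        counter += 1   # retry in the rare case the result is the point at infinity
```

Nobody learns a discrete logarithm of the resulting point relative to another such point, because the point is determined by the hash output rather than by a chosen exponent.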

2.2 The ‘Proofless’ Variant of the Chaum-Pedersen Scheme 

As explained in the previous section, we consider a group, $G$, of prime order $q$,
with generator $g$, in which the DDH problem is simple, while the discrete
logarithm and the Diffie-Hellman problems are practically intractable. The public
key of a participant in the Chaum-Pedersen scheme takes the form $y = g^x$ where
$0 < x < q$ is the participant’s randomly chosen private key. A signature on a
message $m \in G$ in the original Chaum-Pedersen scheme consists of $z = m^x$ plus
a proof that $\log_g(y) = \log_m(z)$. Resolving the latter problem is just an instance
of the Decision Diffie-Hellman (DDH) problem with respect to $g$. Indeed, one
can easily verify that $\log_g(y) = \log_m(z)$ if and only if $z = DH_g(m, y)$. That is, if
one applies the Chaum-Pedersen scheme to the group $G$, one is not required to
send along an explicit proof that $\log_g(y) = \log_m(z)$, as anyone can validate that
themselves. Or, in other words, the signature on a message $m \in G$ only consists
of an element $z = m^x$ of the group $G$, without the additional proof of knowledge.
This is the variant of the Chaum-Pedersen scheme that we use in our schemes.
It follows that by choosing a group of points on a supersingular elliptic curve
of MOV degree 6 (cf. [21] and the previous section), the representation of the
element $z$ requires only $1024/6 \approx 171$ bits to obtain a security level comparable
with 1024-bit RSA (with respect to attacks known today). See [3], where it is
also shown that the above digital signature scheme is secure in the random oracle
model.
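
The proofless variant can be sketched as follows. The DDH test, which the paper obtains from the Weil pairing, is replaced here by a brute-force discrete logarithm in a toy group; this stand-in, the parameters, and the function names are all assumptions for illustration.

```python
P_MOD, Q, G = 23, 11, 2   # toy parameters: G has prime order Q modulo P_MOD

def dlog(base, a):
    """Brute-force discrete log; a stand-in for the pairing-based DDH test."""
    for x in range(Q):
        if pow(base, x, P_MOD) == a:
            return x
    raise ValueError("element is not a power of the base")

def public_key(x):
    """y = g^x for the private key 0 < x < q."""
    return pow(G, x, P_MOD)

def sign(m, x):
    """The signature is the single group element z = m^x."""
    return pow(m, x, P_MOD)

def verify(m, z, y):
    """Accept iff log_g(y) = log_m(z), i.e. iff z = DH_g(m, y); m must not be 1."""
    return dlog(G, y) == dlog(m, z)
```

With private key `x = 7` and message `m = 3`, `verify(m, sign(m, x), public_key(x))` holds, while a modified signature is rejected.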

An interesting property of this variant is that it is self-blindable: it enables
easy randomization without losing the verification property and without
requiring knowledge of the signing key $x$. Indeed, given the signed message $(m, m^x)$,
then by choosing a randomizing factor, $k$, it can be transformed into $(m^k, m^{xk})$.
This property becomes useful when the message $m$ has a property that is
inherited by $m^k$, e.g., knowledge of a certain discrete logarithm, and is explored in
the following sections. Another interesting property of this variant (as pointed
out to us by Stefan Brands) is its easy blinding property, cf. [7]. When a party
wants to obtain a blind signature on a message (typically a hash), $M$, from a
signing party with public key $g^x$ in our variant of the Chaum-Pedersen scheme, it
asks the signing party to sign $M^r$, for a random $0 < r < q$, resulting in $M^{rx}$. The
user can deduce $M^x$ from this using $r$ and verify that it is a correct signature
on $M$, which is publicly verifiable. We will delve no further into this property in
this paper.
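
The blinding exchange just described can be sketched in a toy group as follows. The parameters and function names are assumptions for illustration; unblinding uses the inverse of $r$ modulo the group order $q$.

```python
P_MOD, Q, G = 23, 11, 2   # toy parameters: the subgroup of order Q modulo P_MOD

def blind(message, r):
    """User side: hide M as M^r before sending it to the signer."""
    return pow(message, r, P_MOD)

def signer_sign(blinded, x):
    """Signer side: raise whatever is received to the private key x."""
    return pow(blinded, x, P_MOD)

def unblind(blinded_signature, r):
    """User side: recover M^x from M^(rx) using r^(-1) mod Q (Q is prime)."""
    r_inverse = pow(r, Q - 2, Q)
    return pow(blinded_signature, r_inverse, P_MOD)
```

The signer only ever sees $M^r$, which for random $r$ is a random element of the group, yet the user ends up with the ordinary signature $M^x$.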

In the terminology we introduced above, we formulate the security assump- 
tion that we require for our variant of the Chaum-Pedersen scheme (cf. [8], [9]). 



Assumption 2.1 If the Diffie-Hellman problem with respect to $g$ is
hard, then without knowledge of the private signing key $x$, the only
forged message an attacker can make on the basis of signed messages
$(m_1, m_1^x), (m_2, m_2^x), \ldots, (m_n, m_n^x)$ with respect to the public key $g^x$ is of the form
$\left(g^{i_0} \prod_{j=1}^{n} m_j^{i_j}, \left(g^{i_0} \prod_{j=1}^{n} m_j^{i_j}\right)^x\right)$, for any integers $i_0, i_1, \ldots, i_n$, i.e., a power
product of the signed messages.
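
That power products are always valid signatures, even without the key, can be checked directly. The sketch uses a toy group and non-negative exponents; the parameters and the function name are assumptions for illustration.

```python
P_MOD, Q, G = 23, 11, 2   # toy parameters: the subgroup of order Q modulo P_MOD

def power_product(signed, y, exponents):
    """Combine signed pairs (m_j, m_j^x) and the public key y = g^x.

    exponents = [i_0, i_1, ..., i_n] (non-negative integers). Returns a new
    message and its signature without using the private key x.
    """
    message = pow(G, exponents[0], P_MOD)
    signature = pow(y, exponents[0], P_MOD)
    for (m, z), i in zip(signed, exponents[1:]):
        message = message * pow(m, i, P_MOD) % P_MOD
        signature = signature * pow(z, i, P_MOD) % P_MOD
    return message, signature
```

The assumption states that these derived pairs are the only forgeries possible.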

3 Our Functional Model for CPCs 

In this section, we describe our functional model for CPCs. To this end, we 
first formulate the requirements for self-blindable pseudonymous certificates and 
credentials based upon them. Then we explain how these elements can be used 
to build a CPC system. 

3.1 Self-Blindable Certificates 

In this section we introduce the notion of self-blindable certificates, which is of 
crucial importance for our schemes. Our introduction is somewhat informal, but 
can be made formal without much effort. 

We assume that one public key crypto system is employed by all users and
we denote the collection of all possible user public keys by $\mathcal{U}$. We also assume
that one signing public key crypto system is employed by all trust providers for
certificate issuance. For simplicity’s sake, we also assume that certificate signing
is deterministic, i.e., there is only one possible valid certificate on a fixed public
key, plus optional fields. We let $\mathcal{T}$ denote the collection of possible verification
public keys of trust providers. Our description of a credential on a user public
key $P_U \in \mathcal{U}$ from a trust provider with public verification key $P_T$ takes the form

$\{P_U, Sig(P_U, S_T)\}$,

where $S_T$ stands for the private signing key of the trust provider relating to $P_T$.
This certificate is typically accompanied by a higher-level certificate

$Cert(P_T, \text{“Trust statement”})$

on the public verification key $P_T$. We do not further elaborate on this, but this
certificate can be thought of as a standard X.509 certificate with the “Trust
statement” in one of its extension fields. We denote the collection of all possible
certificates by $\mathcal{C}$.

The certificates are called self-blindable, provided there exists a set called
transformation factor space $F$ and an efficiently computable transformation map
$D: \mathcal{C} \times F \to \mathcal{C}$ with the following properties:

1. For any certificate $C \in \mathcal{C}$ and $f \in F$ the certificate $D(C, f)$ is signed with
the same trust provider public key as $C$.

2. Let $C_1, C_2$ be certificates and $f \in F$ known. If $C_2 = D(C_1, f)$ then one can
efficiently compute a transformation factor $f' \in F$ such that $C_1 = D(C_2, f')$.

3. If $C_1, C_2 \in \mathcal{C}$ are two different certificates on the same user public key, then
so are $D(C_1, f)$ and $D(C_2, f)$. That is, the mapping $D(\cdot, \cdot)$ induces a mapping
$\mathcal{U} \times F \to \mathcal{U}$ and, although abusive, we also use the notation $D(P_U, f)$ for any
user public key $P_U$ and transformation factor $f$.

4. Let $P_U$ be a user public key and let $f \in F$ be a known transformation
factor. Then, a user possesses the private key relating to $P_U$ if and only if it
possesses the private key relating to $D(P_U, f)$.

5. If the user’s public key $P_U \in \mathcal{U}$ is fixed and if $f \in F$ is a uniformly random
element in $F$, then $D(P_U, f)$ is a uniformly random element in $\mathcal{U}$.
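
As a concrete reading of these properties, the sketch here instantiates the map $D$ with the signature variant of Section 2.2: a certificate is a pair (public key, public key raised to the provider's signing key), and $D$ raises both components to the factor $f$. This instantiation, the toy group, the brute-force check standing in for the pairing-based DDH test, and all names are assumptions for illustration; the paper's own constructions appear in Section 4 and may differ.

```python
P_MOD, Q, H = 2039, 1019, 4   # toy group: H has prime order Q modulo P_MOD

def inverse_factor(f):
    """Property 2: the factor f' that undoes f is f^(-1) mod Q (Q is prime)."""
    return pow(f, Q - 2, Q)

def dlog(base, a):
    """Brute-force discrete log; stand-in for a pairing-based DDH test."""
    acc = 1
    for x in range(Q):
        if acc == a:
            return x
        acc = acc * base % P_MOD
    raise ValueError("element is not a power of the base")

def issue(user_pub, signing_key):
    """Trust provider: certificate = (P_U, P_U^s)."""
    return (user_pub, pow(user_pub, signing_key, P_MOD))

def transform(cert, f):
    """The map D: raise public key and signature to the same factor f."""
    user_pub, signature = cert
    return (pow(user_pub, f, P_MOD), pow(signature, f, P_MOD))

def verify(cert, provider_pub):
    """Property 1: a transformed certificate verifies under the same provider key."""
    user_pub, signature = cert
    return dlog(H, provider_pub) == dlog(user_pub, signature)
```

In this instantiation, if the user's private key for $P_U = h^u$ is $u$, then the private key for the transformed public key is $u \cdot f \bmod q$, which is how property 4 shows up.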

We briefly explain the rationale behind these properties. The first property 
enables one to transform a user certificate into another one from the same cer- 
tificate authority; the fourth property ensures that the user still has possession 
of the private key referenced in the transformed certificate provided he knows 
the transformation factor. The fifth property states that all user public keys are 
equally possible in the transformed certificate. As we will explain below, a user 
typically collects credentials on different certificates formed as transformations 
of one fixed certificate. Now, the second property makes it possible to invert
transformations, so that all credentials can be translated to the fixed certificate and
then to other certificates. Finally, the third property is technical and in fact emerged
from our constructions. We have chosen it as part of our formal definition, as it 
enables simple proofs and formulation of other properties, e.g., properties four 
and five. More complicated requirements are possible to arrive at a more general 
notion of self-blindable certificates, but we will not explore this. 

3.2 A CPC System Based on the Building Blocks 

We use the terminology introduced above and we assume that the certificates are 
self-blindable. Our notion of a pseudonymous credential is the simplest possible 
and takes the form 

$\{P_U, [Sig(P_U, S_N), Cert(P_N, \text{“PP statement”})]\}$,

where $P_U$ stands for the public key of the user (with related private key $S_U$).
Moreover, $Sig(P_U, S_N)$ is a signature on the user’s public key with a signing key
of the pseudonym provider (PP) and $Cert(P_N, \text{“PP statement”})$ is a (conventional)
certificate on the public verification key of the pseudonym provider, with
a statement on its applicability included among the usual fields (e.g., expiration
date). For evident reasons, this PP certificate must be used by the pseudonym
provider for many users to prevent linkage of the issued pseudonymous
certificates. Also note that the pseudonym of a user is in fact the user’s public key in its
certificate, which is reminiscent of the SPKI (Simple Public Key Infrastructure) 
approach, cf. [24]. 

Note that the self-blinding properties of the certificates enable the users them- 
selves to generate a new pseudonymous certificate validly signed by the same PP, 
by choosing a (random) factor and transforming an initially issued pseudony- 
mous certificate. 

Our description of a CPC is based upon that of a pseudonymous credential, 
say $\{P_U, [Sig(P_U, S_N), Cert(P_N, \text{“PP statement”})]\}$, and its simplest form is:

$\{P_U, [Sig(P_U, S_N), Cert(P_N, \text{“PP statement”})],$

$[Sig(P_U, S_C), Cert(P_C, \text{“CP statement”})]\}$.

Here, $[Sig(P_U, S_C), Cert(P_C, \text{“CP statement”})]$ is called the credential field. In
this, $Sig(P_U, S_C)$ is a signature on the public key of the owner with a signing
key $S_C$ of the credential provider (CP). Also, $Cert(P_C, \text{“CP statement”})$ is a
(conventional) certificate on the related credential provider’s public verification 
key, that has a statement on its credential applicability, e.g., “the person having 
possession of the private key is over 18 years old” included among the usual 
fields (e.g., expiration date). In a natural fashion one can have several credential 
fields attached to a pseudonymous credential in the above way, which is in fact 
the general form of a CPC. 

Based on the building blocks explained above, one can now construct a wide 
variety of types of CPC systems. We provide a high-level description of one such 
system on which many variations are possible (cf. Figure 1). 

System description 3.1

Initial Registration. The user registers, typically in a non-anonymous fashion, 
with a pseudonym provider. After registration a First Pseudonymous Certifi- 
cate (FPC) issuing protocol between the user and the pseudonym provider 
is started. This protocol is system specific. The pseudonym provider puts 
the FPC in a public directory. When unique pseudonyms are required, the 
provider has the option to maintain a private list of physical persons that 
were issued a pseudonymous certificate; this ensures that at most one such 
certificate is issued to a physical person. 

Credential Issuance. By using a random transformation factor, the user 
transforms its FPC into a random pseudonymous certificate (RPC). The
user securely stores the used transformation factor. Then the user registers
with a credential provider using this RPC, which includes a proof of possession
of the private key referenced in the RPC. This registration need not be
anonymous. The user does what is required to obtain a credential (e.g., takes
a driver’s exam, shows other credentials) and, upon succeeding, is issued a
credential on the RPC, that is the CPC. The pseudonym provider has the 
option to put the CPC in a public directory. 




Credential Use. The user registers (typically anonymously) with a service 
provider using a new RPC, which includes a proof of possession of the private 
key referenced in the new RPC. The user combines all of the CPCs relat- 
ing to credentials required by the service provider into one CPC under the 
registered pseudonym. This is possible by using the second property of
self-blindable certificates on the transformation factors related to the individual,
original CPCs. That is, a CPC is first translated to the First Pseudonym
and then translated to the registered pseudonym (in our constructions these 
two steps can be performed in one operation). This certificate is presented 
to the service provider, together with a proof of possession of the private key 
referenced in this CPC. Once the user is successful in doing so, he will be 
serviced. 

If the service provider wants to be certain that the user has not already been 
issued another pseudonym, the service provider has the option to require 
that the user contact a specific trust provider which we refer to as “unicity” 
provider. The user sends this trust provider the transformation factor(s), 
transforming the new RPC to the first issued pseudonymous certificate stored 
in the pseudonym provider’s directory (i.e., the FPC). This trust provider 
then validates that these factor(s) transform the RPC into an FPC in the PP’s
directory, and that this FPC was not registered before. The trust provider
then reports to the service provider that the user has not registered before.
Note that the PP directory does not specify user identities, only FPCs.
Also note that the specific trust provider need not be the user’s pseudonym
provider.
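
The Credential Use step and the optional unicity check can be sketched as follows, again under the illustrative assumption that certificates are pairs of group elements and that transformation means raising to a factor modulo a toy prime. The names `translation_factor` and `unicity_check`, and the representation of the directory as a Python set, are hypothetical.

```python
P_MOD, Q, H = 2039, 1019, 4   # toy group: H has prime order Q modulo P_MOD

def inverse_factor(f):
    """Factor undoing f, i.e. f^(-1) mod Q (Q is prime)."""
    return pow(f, Q - 2, Q)

def transform(values, f):
    """Raise every component (public key, signatures) to the factor f."""
    return tuple(pow(v, f, P_MOD) for v in values)

def translation_factor(f_issued, f_new):
    """Single factor taking a CPC on FPC^f_issued to one on FPC^f_new.

    It combines 'back to the First Pseudonym' and 'on to the new pseudonym'.
    """
    return inverse_factor(f_issued) * f_new % Q

def unicity_check(rpc, factor, fpc_directory, already_registered):
    """Unicity provider: does the factor map the RPC to an unused FPC in the directory?"""
    fpc = pow(rpc, factor, P_MOD)
    if fpc not in fpc_directory or fpc in already_registered:
        return False
    already_registered.add(fpc)
    return True
```

The unicity provider learns which FPC the user holds, but the directory holds only FPCs, not identities.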

In the system description above, we have used the FPC list of the pseudonym 
provider as the reference data for all trust providers that need to verify that a 
‘physical’ user cannot register twice (under different pseudonyms) with a service 
provider. This means that if two such trust providers conspire, they can link 
together the different pseudonyms of a user. One can prevent this linkage with a 
flexible secret sharing technique as follows. During registration, the pseudonym 
provider and the user, say U, exchange a secret, S. If a trust provider, say T,
wants to provide assurance on unique pseudonyms, then provider T is provided
with a list consisting of transformed FPCs, in such a way that:

— user U’s FPC is transformed using a transformation factor based on a secure 
hash of the name of the provider T and the secret S; and 

— the order of the FPCs is randomly permuted. 

If user U wants to assure the trust provider T that it is not registering twice 
(under different pseudonyms) with a service provider via T, then it provides the 
provider with the transformation factor that transforms the RPC (see above) into the
transformed FPC stored at the provider T. This technique can be iterated: user 
U can (after proving possession of a transformed FPC at T) be issued another 
secret by T, and the transformed FPC can be re-transformed by T and stored 
at another trust provider T2, etc. By combining transformation factors, user
U can employ provider T2’s service without any interference from provider T.
Moreover, in such a setting, linkage requires that all such trust providers and
the pseudonym provider conspire.
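
The derivation of a provider-specific transformation factor can be sketched as follows. This is only an illustration under stated assumptions: the hash function (SHA-256), the byte encoding of the provider name and the secret, the toy group (the order-11 subgroup modulo 23), and the names `provider_factor` and `transform` are not taken from the text, and the transformation is modelled as plain exponentiation in the style of Section 4.1.

```python
import hashlib

# Toy parameters (assumed for illustration only): the subgroup of prime
# order Q = 11 in the multiplicative group modulo P = 23.
P, Q = 23, 11


def provider_factor(provider_name: str, secret: bytes) -> int:
    """Hypothetical derivation: hash provider name and secret S to a factor in [1, Q-1]."""
    digest = hashlib.sha256(provider_name.encode() + b"|" + secret).digest()
    return int.from_bytes(digest, "big") % (Q - 1) + 1


def transform(element: int, factor: int) -> int:
    """Model a transformation as exponentiation by the factor."""
    return pow(element, factor, P)


secret_s = b"secret shared by U and the pseudonym provider"
fpc_key = pow(4, 7, P)              # stands in for user U's FPC (toy key g^x)

# The list handed to provider T contains U's FPC transformed with T's factor.
f_t = provider_factor("T", secret_s)
entry_at_t = transform(fpc_key, f_t)

# U registers under an RPC obtained from the FPC with a private factor u and
# combines factors to map the RPC onto the entry stored at T.
u = 3
rpc_key = transform(fpc_key, u)
rpc_to_entry = pow(u, -1, Q) * f_t % Q
```

Because the user knows both S and the name of T, it can recompute `f_t` on its own; provider T only ever sees the transformed entry and the combined factor `rpc_to_entry`, not the FPC itself.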

In Figure 1 we have depicted the (five) steps from pseudonym issuance to 
CPC application in a sample voting application. The communication between 
the “unicity” provider and the service provider (the voting application) is not 
depicted.

[Figure 1: sample voting application. Recoverable labels: “Lives in Amsterdam”; “TP: Local Government CPC”; ballot “I vote: Mr. W. Kok / Mrs. M. V. Buuren”; “Signature”.]

Fig. 1. Overview of system description 31



3.3 Revocation of Certificate Bases 

As users typically will not present the originally issued certificates to service 
providers, certificates cannot be revoked in the conventional way. A primary 
concern is that the revocation process should not make it possible to link cre- 
dential use, except, possibly, by certain trusted parties. 

There are several methods to address revocation in our model, but we out- 
line only two. The first method is pro-active, and consists of letting the trust 
providers employ signing keys with a short expiration time (e.g., a week). If a 
pseudonymous certificate or a credential relating to such a certificate has not 
been revoked, then the trust provider automatically updates the certificates or 
credentials in its directory with newly signed ones. A user can collect the
updated pseudonymous certificates and credentials, preferably via an anonymous
channel to reduce the chances of linkage. To this end, the user can, for
example, collect many certificates, among them the required ones. By revoking
its FPC, the user can effectively revoke all credentials based on it.

The second method for revocation we outline consists of sending along spe- 
cific transformation factors with a (credential) pseudonymous certificate, to a 
specific trust provider. This trust provider can then retrieve the original issued 
(credential) pseudonymous certificates and find out if they have been revoked. 
The trust provider then provides a statement on the status of the (credential) 
pseudonymous certificate to the service provider. This functionality resembles 
the use of an On-line Certificate Status Protocol (OCSP) request, commonly 
used on the Internet (cf. [23]). Of course, the service provider still needs to 
verify that the user is in possession of the private key referenced in the used 
randomized CPC. 
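
The status lookup performed by the trust provider can be sketched as follows. This is a minimal sketch, assuming certificates of the form {X, Y} that are blinded by exponentiation (as in the simple construction of Section 4.1); the toy group, the function names, and the reply labels "good", "revoked" and "unknown" are illustrative assumptions, not part of the described system.

```python
# Toy group (assumed): the subgroup of prime order Q = 11 modulo P = 23,
# generated by G = 4. Real deployments need cryptographically large groups.
P, Q, G = 23, 11, 4


def transform(cert, factor):
    """Blind a certificate {X, Y} by raising both components to the factor."""
    x, y = cert
    return (pow(x, factor, P), pow(y, factor, P))


def original_certificate(presented, factor):
    """Undo the blinding with the transformation factor the user sent along."""
    inverse = pow(factor, -1, Q)     # modular inverse of the factor mod Q
    return transform(presented, inverse)


def status(presented, factor, directory, revoked):
    """Hypothetical status reply of the trust provider to the service provider."""
    original = original_certificate(presented, factor)
    if original not in directory:
        return "unknown"
    return "revoked" if original in revoked else "good"


z = 5                                # toy signing key of the trust provider
issued = [(pow(G, x, P), pow(G, x * z, P)) for x in (2, 3)]
directory = set(issued)
revoked = {issued[1]}

print(status(transform(issued[0], 7), 7, directory, revoked))
print(status(transform(issued[1], 7), 7, directory, revoked))
```

Only the trust provider that receives the factor can map the presented certificate back to the issued one; the service provider sees just the status reply.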

The second revocation technique can be supplemented with the flexible secret 
sharing technique described at the end of the previous section. 

4 Constructions for Credential Pseudonymous 
Certificates 

4.1 A Simple Construction 

In this section we describe an initial and very simple construction for self- 
blindable certificates and thus CPCs. We describe this scheme merely for pur- 
poses of illustration, as it has the serious inherent draw-back of not supporting 
cryptographic protection against users sharing credentials. Therefore, to
implement this construction one would need to trust tamper-resistant devices
to prevent users from sharing credentials. As the construction in Section 4.2
provides cryptographic protection against users sharing credentials, that
construction is preferable to the scheme presented in this section.

Let $G = \langle g \rangle$ be a group of prime order $q$ in which the DDH
problem is efficiently solvable, while the discrete logarithm and the
Diffie-Hellman problems are practically intractable. We also assume that the
(provable) random generation of elements in $G$ without knowing any relative
discrete logarithms is possible (see the end of Section 2.1). The description
of the group $G$, together with $g$ and $q$, forms the system parameters.

The set $T$ of all trust providers’ public keys consists of pairs $(j, j^s)$,
where $0 < s < q$ is the related private key and where $j \in G \setminus \{1\}$.
We assume that each trust provider’s public generator $j$ is (provably)
randomly chosen, e.g., it could be based on the output of a secure hash
algorithm with a fixed input. The set $U$ of users’ public keys consists of
elements of the form $g^x$, where the values $0 < x < q$ are all possible user
private keys. There is a subtle reason why $x = 0$ is not allowed in principle,
see below. Note that a user can prove possession of $x$ in a zero-knowledge
fashion with the Schnorr identification protocol [25]. Moreover, several digital
signature systems can be based on the user public/private key pairs mentioned
above, e.g., DSA [15], ElGamal [14] and Schnorr [25]. Finally, a certificate issued
by a trust provider with public key $(h, h^z)$ on a user public key $g^x$ takes
the form:

$$\{g^x, g^{xz}\}.$$

Note that the above certificate is based on the variant of the Chaum-Pedersen
signature (as outlined in Section 2) on $g^x$ with respect to the public key
$(h, h^z)$, i.e., $(g^x)^z$. An important feature of this variant is that it is
not required to add an interactive proof that the second component indeed has
the form $g^{xz}$, as the DDH problem is assumed to be easy. Due to the
restrictions on the first element in the certificate, it cannot be equal to the
unity element. If this condition is not also checked by applications, then
certificate forgery becomes simple.

The certificates $C$ constructed in this way are self-blindable. To this end,
choose the transformation factor space $F$ equal to $GF(q)^*$ and define the
transformation $D : C \times F \to C$ as

$$(\{X, Y\}, f) \mapsto \{X^f, Y^f\}.$$

That is, the certificate $\{g^x, g^{xz}\}$ is transformed to the certificate
$\{g^{xf}, g^{xfz}\}$ under factor $f$. It is a simple verification that
$D(\cdot, \cdot)$ satisfies the five properties of a transformation and, thus,
that the certificates constructed in this way are self-blindable.
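
The issuing, blinding and verification steps of this construction can be illustrated with a toy example. This is a sketch under loud assumptions: the group is the order-11 subgroup modulo 23 rather than a pairing-friendly curve, and the "easy DDH test" is emulated by brute-force discrete logarithms, which only works because the group is tiny. All function names are hypothetical.

```python
# Toy group (assumed): subgroup of prime order Q = 11 modulo P = 23, generator 4.
P, Q, G = 23, 11, 4


def dlog(base, value):
    """Brute-force discrete logarithm; feasible only in this tiny toy group."""
    for e in range(Q):
        if pow(base, e, P) == value:
            return e
    raise ValueError("value is not a power of base")


def is_ddh_tuple(h, hz, x, y):
    """Stand-in for the easy DDH test: does log_h(hz) equal log_x(y)?"""
    return dlog(h, hz) == dlog(x, y)


def issue(user_key, z):
    """The trust provider certifies g^x as {g^x, g^(xz)}."""
    return (user_key, pow(user_key, z, P))


def blind(cert, f):
    """Self-blinding: ({X, Y}, f) -> {X^f, Y^f}."""
    return (pow(cert[0], f, P), pow(cert[1], f, P))


def verify(cert, h, hz):
    """Reject the unity element, then apply the DDH test against (h, h^z)."""
    return cert[0] != 1 and is_ddh_tuple(h, hz, cert[0], cert[1])


z = 5                           # trust provider's private key
h = pow(G, 3, P)                # trust provider's generator (toy choice)
hz = pow(h, z, P)
cert = issue(pow(G, 7, P), z)   # user private key x = 7
blinded = blind(cert, 9)        # transformation factor f = 9

print(verify(cert, h, hz), verify(blinded, h, hz))
```

The blinded certificate verifies against the same trust provider key although its components differ from the issued ones, and blinding twice equals blinding once with the product of the factors.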

Notice that the transfer of credentials is simple in this construction, if the 
user is able to retrieve (and transfer) the private key related to the public key 
of a (transformed) pseudonymous certificate. This problem can be controlled by 
ensuring that all security operations with respect to credentials take place on 
a tamper resistant signing device in such a way that private key information 
of (transformed) certificates can be used (‘addressed’) but not retrieved. The 
use of such devices needs to be addressed in the FPC issuing protocol for these 
certificates, for instance as follows. 

1. The user registers, typically in a non-anonymous fashion, with a pseudonym 
provider. 

2. The pseudonym provider generates a random $0 < x < q$, and forms the
user public key $g^x$ and the certificate $\{g^x, g^{xz}\}$. All information is put on a
tamper resistant signing device, in such a way that private key information
of (transformed) certificates can be used but not retrieved.

3. The secure signing device is handed over to the user in a secure fashion. 

Having filled in this issuing protocol, our CPC scheme now follows system 
description 31. Protection against pseudonym/credential linking and pseudo- 
nym/credential translation are obvious consequences of the properties of self- 
blindable certificates. For the other two security properties (protection against 
forgery and transfer), one needs to trust devices resistant to user tampering. 

4.2 A More Robust Construction 

This construction is based on the technique in Brands’ e-cash scheme to trace
double spenders (cf. [4]). Just as in the previous section, our construction
is based on the variant of the Chaum-Pedersen signature scheme as introduced
in Section 2. So, again, let $G = \langle g \rangle$ be a group of prime order
$q$ in which the DDH problem is efficiently solvable, while the discrete
logarithm and the Diffie-Hellman problems are practically intractable. We also
assume that the (provable) random generation of elements in $G$ without knowing
any relative discrete logarithms is possible. In addition to this, we assume
that there exists an efficiently computable embedding $E(\cdot)$ from $G$ into a
group $G'$ where all three problems DDH, DH and DL are practically intractable.
All these requirements are met by suitable groups of points on supersingular
elliptic curves, cf. the end of Section 2.1. The description of the group $G$,
including $g$ and $q$, the group $G'$ and the embedding $E(\cdot)$ are
considered to be system parameters.

As before, the set $T$ of all trust providers’ public keys consists of pairs
$(j, j^s)$, where $0 < s < q$ is the related private key and where
$j \in G \setminus \{1\}$. We assume that each trust provider’s public generator
$j$ is (provably) randomly chosen, e.g., it could be based on the output of a
secure hash algorithm with a fixed input. In addition we assume that the
pseudonym provider publishes a certified pair $(r, s) = (r, r^f)$, where
$r, s \in G$ and where $0 < f < q$ is unknown to all parties. Generating such a
pair consists of choosing two (provably) random elements $r, s$, which
determine $f$. Alternatively, the pseudonym provider can choose the element $r$
in a provably random fashion, generate a random element $0 < f < q$ and form
$s = r^f$. We prefer the first construction, for two reasons. First, it is
difficult for the pseudonym provider to convince others that $f$ has been chosen
randomly and, second, it is good practice to have as few secret keys in a system
as possible.

The set $U$ of users’ public keys consists of elements of the form
$(g_1, g_2, g_1^{x_1} g_2^{x_2})$. Here $0 < x_1, x_2 < q$ is the related
private key, $g_1$ is a random generator and $\log_{g_1}(g_2) = f$. As in the
previous scheme, we require that $g_1^{x_1} g_2^{x_2}$ be unequal to the unity
element. Note that a participant can prove possession of $x_1, x_2$ in
a zero-knowledge fashion with the Okamoto variant of Schnorr’s identification
protocol [22]. In the same paper, a variant of Schnorr’s signature scheme is
described based on the user public/private key pairs mentioned above. Finally,
a certificate issued by a trust provider with public key $(h, h^z)$ on a user’s
public key $(g_1, g_2, g_1^{x_1} g_2^{x_2})$ takes the form:

$$\{g_1,\; g_2,\; g_1^{x_1} g_2^{x_2},\; (g_1^{x_1} g_2^{x_2})^z\}.$$

Again, this is precisely the variant of the Chaum-Pedersen signature (as outlined
in Section 2) on the user’s public key with respect to the public key
$(h, h^z)$. As the DDH problem is easy, on the basis of the certified pair
$(r, r^f)$ anyone can and should verify that the first two parameters in the
certificate are indeed correctly formed, i.e., that the second one is an $f$-th
power of the first one (cf. the alternative description of the DDH problem in
Section 2.1). Due to the restrictions on the three elements in the certificate,
none of them can be equal to the unity element. If this condition is not also
checked by applications, then certificate forgery becomes simple.

The certificates $C$ constructed in this way are self-blindable. To this end,
define the transformation factor space by $F = GF(q)^* \times GF(q)^*$ and the
transformation $D : C \times F \to C$ as:

$$(\{X, Y, W, Z\}, (k, l)) \mapsto \{X^l,\; Y^l,\; W^{kl},\; Z^{kl}\}.$$

That is, the certificate $\{g_1, g_2, g_1^{x_1} g_2^{x_2}, (g_1^{x_1} g_2^{x_2})^z\}$
is transformed into the certificate

$$\{g_1^{l},\; g_2^{l},\; g_1^{x_1 k l} g_2^{x_2 k l},\; (g_1^{x_1 k l} g_2^{x_2 k l})^z\}$$

under the transformation factor $(k, l)$. It is a simple verification that
$D(\cdot, \cdot)$ satisfies the five properties of a transformation. Notice that
two transformation factors $(k, l)$ are used to ensure that a randomly
transformed public key is indeed a random element in the user’s public key space.
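
The effect of the two-factor transformation can be checked numerically in a toy setting. The sketch below assumes the same tiny group as before (order 11 modulo 23) and, purely for illustration, lets the test code know the secret values $f$ and $z$ so that the well-formedness conditions can be checked by direct exponentiation instead of a DDH test; the function names are hypothetical.

```python
# Toy group (assumed): subgroup of prime order Q = 11 modulo P = 23, generator 4.
P, Q, G = 23, 11, 4


def issue(g1, g2, x1, x2, z):
    """Certificate {g1, g2, W, W^z} with W = g1^x1 * g2^x2."""
    w = pow(g1, x1, P) * pow(g2, x2, P) % P
    return (g1, g2, w, pow(w, z, P))


def blind(cert, k, l):
    """Transformation ({X, Y, W, Z}, (k, l)) -> {X^l, Y^l, W^(kl), Z^(kl)}."""
    x, y, w, sig = cert
    return (pow(x, l, P), pow(y, l, P), pow(w, k * l, P), pow(sig, k * l, P))


def opens(cert, x1, x2):
    """Check that (x1, x2) is the private key belonging to the certificate."""
    g1, g2, w, _ = cert
    return w == pow(g1, x1, P) * pow(g2, x2, P) % P


f, z = 6, 5                     # log_{g1}(g2) and the provider's signing key
g1 = pow(G, 2, P)
g2 = pow(g1, f, P)
x1, x2 = 1, 7                   # private key of a first pseudonymous certificate
cert = issue(g1, g2, x1, x2, z)

k, l = 3, 4
blinded = blind(cert, k, l)
new_key = (k * x1 % Q, k * x2 % Q)   # private key of the blinded certificate

print(opens(blinded, *new_key))
```

Relative to the blinded generators the private key becomes $(k x_1, k x_2)$, while the relations between the components (second is the $f$-th power of the first, fourth is the $z$-th power of the third) are preserved.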

The FPC issuing protocol for these certificates can be filled in as follows, but
many variations are possible; the pseudonym provider’s public key is denoted as
$(h, h^z)$, where $h \in G \setminus \{1\}$ is (provably) randomly chosen.

1. The user registers, typically in a non-anonymous fashion, with a pseudonym 
provider. 

2. The pseudonym provider generates a random pair $(g_1, g_2)$ such that
$g_2 = g_1^f$, by choosing a (provably) random power of the elements $r, s$.
The pair $(g_1, g_2)$ is sent to the user, or to a party acting on its behalf
(e.g., a smart card issuer).

3. The user (or a party acting on its behalf) generates a random private key
$0 < x < q$ and forms $g_2^x$. The user sends $g_2^x$ and proves possession of
the private key $x$ (i.e., the discrete logarithm of the sent element with
respect to $g_2$), e.g., by using Schnorr’s protocol.

4. Based on the elements $g_1$, $g_2$ and $g_2^x$, the pseudonym provider forms
the public key $(g_1, g_2, g_1 g_2^x)$, checks to ensure that the last element is
unequal to the unity element and places a Chaum-Pedersen signature on it, i.e.,
$(g_1 g_2^x)^z$. Moreover, the provider employs the embedding $E : G \to G'$ and
determines the elements $E(g_2), E(g_2^x)$ of the group $G'$ (in which the DDH,
DH and DL problems are hard). Next the provider determines a random power $r$
of these elements, i.e., $E(g_2)^r, E(g_2^x)^r$. The provider then forms a
conventional non-repudiation certificate (e.g., based on the US Digital
Signature Algorithm) on $(E(g_2)^r, E(g_2^x)^r)$. The first pseudonymous
certificate and the non-repudiation certificate are issued to the user. Both are
also stored in separate directories.

In the terminology of the above protocol, since the embedding $E(\cdot)$ is a
homomorphism, it follows directly that the private non-repudiation signing key
is equal to $x$. We have used a non-repudiation signing key only as an example
of a private key that is highly important to a user. Many more examples exist 
(e.g., the user’s signing key for financial transactions). 

There are two reasons why the user’s non-repudiation key is embedded in
the group $G'$ in the specified way. First of all, using a group where all three
of the DDH, DH and DL problems are hard seems appropriate for a conventional
signature scheme. Second, embedding the non-repudiation key in the specified
way prevents linkage between the first pseudonymous certificate and the
non-repudiation certificate. Should a party have access to $g_2^r, g_2^{xr}$
(whose $E(\cdot)$ images appear in the non-repudiation key), then this party
would be able to link this to the pair $g_2, g_2^x$, as the DDH problem in $G$
is easy. However, inverting the embedding $E(\cdot)$ is hard (cf. the remarks at
the end of Section 2.1), so inverting the values $E(g_2)^r, E(g_2^x)^r$
(deducible from the non-repudiation certificate) is not a practical
possibility. Moreover, as the DDH problem is presumed to be hard in $G'$, it
would be infeasible to relate $E(g_2), E(g_2^x)$ (deducible from the first
pseudonymous certificate) to $E(g_2)^r, E(g_2^x)^r$ (deducible from the
non-repudiation certificate). Strictly speaking, such a linkage might not be an
issue, as users will typically employ transformed pseudonymous credentials.
However (cf. the generic description 31), this might become an issue should a
service provider want to be certain that the user has not already been issued
another pseudonym. Indeed, the user would then need to provide a trust provider
with the transformation factor from its registered pseudonymous certificate to
the First Pseudonymous Certificate. We finally note that, in the issuing
protocol, the pseudonym provider can alternatively first calculate random
$r$-th powers of the elements $g_2, g_2^x$ in the group $G$ and then apply the
embedding $E(\cdot)$. For the same $r$, this would give the same result as with
the method described above.

Having filled in this issuing protocol, our CPC scheme now follows from 
the system description 31. Protection against pseudonym/credential linking and 
pseudonym/credential translation are obvious consequences of the properties of 
self-blindable certificates. We discuss the two other security properties. 

Protection against pseudonym/credential transfer
This protection is based on an all-or-nothing concept (see the introduction).
The private key in a transformed credential takes the form
$(k, k \cdot x \bmod q)$ for some $0 < k < q$. Note that dividing the second
part by the first part yields the user’s non-repudiation key $x$. Hence, if the
user transfers a credential, then it also transfers a copy of its
non-repudiation signing key. We think that this is a sufficient deterrent to
transferring credentials (which can be supplemented with the physical security
of a signing device).
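
The all-or-nothing property amounts to one modular division. A minimal sketch, assuming a toy group order $q = 11$ and a hypothetical helper name:

```python
Q = 11  # toy prime group order (assumed for illustration only)


def recover_signing_key(transferred_key):
    """Divide the second component by the first, modulo Q, to obtain x."""
    k, kx = transferred_key
    return kx * pow(k, -1, Q) % Q


x = 7                              # the user's non-repudiation signing key
k = 3                              # factor hidden in a transformed credential
transferred = (k, k * x % Q)       # private key handed over with the credential

print(recover_signing_key(transferred))
```

Whoever receives the private key of a transformed credential therefore learns $x$ regardless of which factor $k$ was used for blinding.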

Protection against pseudonym/credential forgery [Indication] 

Under Assumption 21, we provide a sketched proof in the appendix that an 
efficient pseudonym/credential forgery algorithm based on all issued certificates 
and private keys, will in fact provide an algorithm determining hard discrete 
logarithms with non-negligible probability. 



5 Conclusion 

We have described two simple, efficient and effective credential pseudonymous 
certificate systems, which also support anonymity without the need for a trusted 
third party. Both systems are based on a new paradigm, called self-blindable 
certificates. Such certificates were constructed using the Weil pairing in super- 
singular elliptic curves. The second system provides cryptographic protection 
against the forgery and transfer of credentials.

A Appendix: Forgery Protection in the Robust
Construction

Suppose that a total of $n$ certificates are issued under one trust provider,
e.g., of type

$$\{g_{1,i},\; g_{2,i},\; g_{1,i}^{x_{1,i}} g_{2,i}^{x_{2,i}},\; (g_{1,i}^{x_{1,i}} g_{2,i}^{x_{2,i}})^z\}, \quad i = 1, 2, \ldots, n,$$

where the trust provider’s public key is of the form $(h, h^z)$ as usual. Also
suppose that a forger has access to all private keys $x_{1,i}$ and $x_{2,i}$
and is able to produce a forged certificate, say

$$\{h_1,\; h_2,\; h_1^{y_1} h_2^{y_2},\; (h_1^{y_1} h_2^{y_2})^z\},$$

where $0 \le y_1, y_2 < q$ is known to the forger. Notice that $(y_1, y_2)$
should not be equal to $(0, 0)$, as then the certificate contains the unity
element. As $h_2$ should be an $f$-th power of $h_1$, it follows from
Assumption 21 that $h_1$ (resp. $h_2$) is a power product of the $\{g_{1,i}\}$
and $r$ (resp. $\{g_{2,i}\}$ and $s$). Likewise, $h_1^{y_1} h_2^{y_2}$ is a power
product of all $g_{1,i}^{x_{1,i}} g_{2,i}^{x_{2,i}}$ and $h$. By choosing the
right transformation factors, we may assume without loss of generality that
$h_1 = r^b \prod_{i \in I} g_{1,i}$, $h_2 = s^b \prod_{i \in I} g_{2,i}$ and

$$h_1^{y_1} h_2^{y_2} = h^c \prod_{j \in J} g_{1,j}^{x_{1,j}} g_{2,j}^{x_{2,j}} \qquad (1)$$

for some subsets $I, J$ of $\{1, 2, \ldots, n\}$ and $b, c \in \{0, 1\}$.

We now sketch how to rule out the possibility that either $b$ or $c$ is equal
to 1. To this end, suppose that the probability of the event $c = 1$ is
non-negligible. Now, if one simulates $f$, then one can use the forgery
algorithm to determine $\log_r(h)$. Indeed, one feeds the algorithm $g_{1,i}$
(resp. $g_{2,i}$) of the form $r^{t_i}$ (resp. $s^{t_i}$), where $0 < t_i < q$
is known and random, and chooses the $0 < x_{1,i}, x_{2,i} < q$ in a random way
($i = 1, 2, \ldots, n$). As this is ‘correct’ input, it will lead to equalities
of type (1). Now, if $c = 1$ in any of these equalities then the algorithm has
produced $\log_r(h)$, which is assumed to be a hard problem. Likewise, if the
probability that $b = 1$ is non-negligible, then simulation of $f$ and the
forgery algorithm will also make it possible to determine discrete logarithms
with respect to $r$, by basing all $g_{1,i}, g_{2,i}$ on random powers of an
element $z$ for which $\log_r(z)$ is required. Thus we conclude that
$b = c = 0$ with overwhelming probability and that the equations (1) are
actually of type

$$h_1^{y_1} h_2^{y_2} = \prod_{j \in J} g_{1,j}^{x_{1,j}} g_{2,j}^{x_{2,j}}, \qquad (2)$$

where $h_1 = \prod_{i \in I} g_{1,i}$ and $h_2 = \prod_{i \in I} g_{2,i}$. Note
that the sets $I, J$ cannot be empty, as the unity element would then occur in
the certificate. Moreover, if the set $I \cup J$ does not contain at least two
elements, then $I = J$ is a singleton, and the forgery algorithm has in fact
produced a transformed user certificate, which is not considered a forgery.
Now, suppose that $\log_a(b)$ is required for some $a, b \in G$; then this can
be determined with high probability by basing ‘half’ of the $g_{1,i}, g_{2,i}$
on random powers of $a$ and the other half on random powers of $b$. With
non-negligible probability, the set $I \cup J$ will contain both a
$g_{1,i}, g_{2,i}$ based on $a$ and
$b$, and will hence give a relation providing $\log_a(b)$.

Abstract. In this paper we formalize the notion of a ring signature,
which makes it possible to specify a set of possible signers without 
revealing which member actually produced the signature. Unlike group 
signatures, ring signatures have no group managers, no setup procedures, 
no revocation procedures, and no coordination: any user can choose any 
set of possible signers that includes himself, and sign any message by 
using his secret key and the others’ public keys, without getting their 
approval or assistance. Ring signatures provide an elegant way to leak 
authoritative secrets in an anonymous way, to sign casual email in a way 
which can only be verified by its intended recipient, and to solve other 
problems in multiparty computations. The main contribution of this 
paper is a new construction of such signatures which is unconditionally 
signer-ambiguous, provably secure in the random oracle model, and 
exceptionally efficient: adding each ring member increases the cost of 
signing or verifying by a single modular multiplication and a single 
symmetric encryption. 

Keywords: signature scheme, ring signature scheme, signer-ambiguous 
signature scheme, group signature scheme, designated verifier signature 
scheme. 



1 Introduction 

The general notion of a group signature scheme was introduced in 1991 by 
Chaum and van Heyst [2]. In such a scheme, a trusted group manager predefines
certain groups of users and distributes specially designed keys to their members. 
Individual members can then use these keys to anonymously sign messages on 
behalf of their group. The signatures produced by different group members look 
indistinguishable to their verifiers, but not to the group manager who can revoke 
the anonymity of misbehaving signers. 

In this paper we formalize the related notion of ring signature schemes. These 
are simplified group signature schemes which have only users and no managers 
(we call such signatures “ring signatures” instead of “group signatures” since 
rings are geometric regions with uniform periphery and no center). Group
signatures are useful when the members want to cooperate, while ring signatures
are useful when the members do not want to cooperate. Both group signatures and
ring signatures are signer-ambiguous, but in a ring signature scheme there are
no prearranged groups of users, there are no procedures for setting, changing, or 
deleting groups, there is no way to distribute specialized keys, and there is no 
way to revoke the anonymity of the actual signer (unless he decides to expose 
himself). Our only assumption is that each member is already associated with 
the public key of some standard signature scheme such as RSA. To produce a 
ring signature, the actual signer declares an arbitrary set of possible signers that 
includes himself, and computes the signature entirely by himself using only his 
secret key and the others’ public keys. In particular, the other possible signers 
could have chosen their RSA keys only in order to conduct e-commerce over the 
internet, and may be completely unaware that their public keys are used by a 
stranger to produce such a ring signature on a message they have never seen and 
would not wish to sign. 

The notion of ring signatures is not completely new, but previous references 
do not crisply formalize the notion, and propose constructions that are less effi- 
cient and/or that have different, albeit related, objectives. They tend to describe 
this notion in the context of general group signatures or multiparty
constructions, which are quite inefficient. For example, schemes three and four
of Chaum et al. [2], and the two signature schemes in Definitions 2 and 3 of
Camenisch’s paper [1], can be viewed as ring signature schemes. However, the
former schemes require zero-knowledge proofs with each signature, and the latter
schemes require as many modular exponentiations as there are members in the
ring. Cramer et al. [3] show how to produce witness-indistinguishable
interactive proofs. Such
proofs could be combined with the Fiat-Shamir technique to produce ring sig- 
nature schemes. Similarly, DeSantis et al. [10] show that interactive SZK for 
random self-reducible languages are closed under monotone boolean operations, 
and show the applicability of this result to the construction of a ring signature 
scheme (although they don’t use this terminology). 

The direct construction of ring signatures proposed in this paper is based on 
a completely different idea, and is exceptionally efficient for large rings (adding 
only one modular multiplication and one symmetric encryption per ring mem- 
ber both to generate and to verify such signatures). The resultant signatures 
are unconditionally signer-ambiguous and provably secure in the random oracle 
model. 

2 Definitions and Applications 

2.1 Ring Signatures 

Terminology: We call a set of possible signers a ring. We call the ring member 
who produces the actual signature the signer and each of the other ring members 
a non-signer. 

We assume that each possible signer is associated (via a PKI directory or 
certificate) with a public key $P_k$ that defines his signature scheme and
specifies his verification key. The corresponding secret key (which is used to
generate regular signatures) is denoted by $S_k$. The general notion of a ring
signature scheme
does not require any special properties of these individual signing schemes, but
our simplest construction assumes that they use trapdoor one-way permutations 
(such as the RSA functions) to generate and verify signatures. 

A ring signature scheme is defined by two procedures: 

— ring-sign($m, P_1, P_2, \ldots, P_r, s, S_s$), which produces a ring signature
$\sigma$ for the message $m$, given the public keys $P_1, P_2, \ldots, P_r$ of
the $r$ ring members, together with the secret key $S_s$ of the $s$-th member
(who is the actual signer).

— ring-verify($m, \sigma$), which accepts a message $m$ and a signature $\sigma$
(which includes the public keys of all the possible signers), and outputs either
true or false.

A ring signature scheme is set-up free: The signer does not need the knowl- 
edge, consent, or assistance of the other ring members to put them in the ring 
- all he needs is knowledge of their regular public keys. Different members can 
use different independent public key signature schemes, with different key and 
signature sizes. Verification must satisfy the usual soundness and completeness 
conditions, but in addition we want the signatures to be signer-ambiguous in
the sense that the verifier should be unable to determine the identity of the 
actual signer in a ring of size r with probability greater than 1/r. This limited 
anonymity can be either computational or unconditional. Our main construction 
provides unconditional anonymity in the sense that even an infinitely powerful 
adversary with access to an unbounded number of chosen-message signatures 
produced by the same ring member cannot guess his identity with any advan- 
tage, and cannot link additional signatures to the same signer. 



2.2 Leaking Secrets 

To motivate the title for this paper, suppose that Bob (also known as “Deep
Throat”) is a member of the cabinet of Lower Kryptonia, and that Bob wishes
to leak a juicy fact to a journalist about the escapades of the Prime Minister, 
in such a way that Bob remains anonymous, yet such that the journalist is 
convinced that the leak was indeed from a cabinet member. 

Bob cannot send to the journalist a standard digitally signed message, since 
such a message, although it convinces the journalist that it came from a cabinet 
member, does so by directly revealing Bob’s identity. 

It also doesn’t work for Bob to send the journalist a message through a 
standard anonymizer, since the anonymizer strips off all source identification and 
authentication: the journalist would have no reason to believe that the message 
really came from a cabinet member at all. 

A standard group signature scheme does not solve the problem, since it re- 
quires the prior cooperation of the other group members to set up, and leaves 
Bob vulnerable to later identification by the group manager, who may be con- 
trolled by the Prime Minister. 

The correct approach is for Bob to send the story to the journalist through 
an anonymizer, signed with a ring signature scheme that names each cabinet
member (including himself) as a ring member. The journalist can verify the
ring signature on the message, and learn that it definitely came from a cabinet 
member. He can even post the ring signature in his paper or web page, to prove to 
his readers that the juicy story came from a reputable source. However, neither 
he nor his readers can determine the actual source of the leak, and thus the 
whistleblower has perfect protection even if the journalist is later forced by a 
judge to reveal his “source” (the signed document). 



2.3 Designated Verifier Signature Schemes 

A designated verifier signature scheme is a signature scheme in which signatures 
can only be verified by a single “designated verifier” chosen by the signer. This 
concept was first introduced by Jakobsson, Sako and Impagliazzo at Eurocrypt ’96
[6]. A typical application is to enable users to authenticate casual emails without
being legally bound to their contents. For example, two companies may exchange 
drafts of proposed contracts. They wish to add to each email an authenticator, 
but not a real signature which can be shown to a third party (immediately or 
years later) as proof that a particular draft was proposed by the other company. 
A designated verifier scheme can thus be viewed as a “light signature scheme” 
which can authenticate messages to their intended recipients without having the 
nonrepudiation property. 

One approach would be to use zero knowledge interactive proofs, which can 
only convince their verifiers. However, this requires interaction and is difficult 
to integrate with standard email systems and anonymizers. We can use non- 
interactive zero knowledge proofs, but then the authenticators become signatures 
which can be shown to third parties. Another approach is to agree on a shared 
secret symmetric key k, and to authenticate each contract draft by appending a 
message authentication code (MAC) for the draft computed with key k. A third 
party would have to be shown the secret key to validate a MAC, and even then 
he wouldn’t know which of the two companies computed the MAC. However, 
this requires an initial set-up procedure, in which we still face the problem of 
authenticating the emailed choice of k without actually signing it. 

A designated verifier scheme provides a simple solution to this problem: com- 
pany A can sign each draft it sends, naming company B as the designated verifier. 
This can be easily achieved by using a ring signature scheme with companies A 
and B as the ring members. Just as with a MAC, company B knows that the 
message came from company A (since no third party could have produced this 
ring signature), but company B cannot prove to anyone else that the draft of the 
contract was signed by company A, since company B could have produced this 
draft by itself. Unlike the case of MACs, this scheme uses public key cryptography,
and thus A can send unsolicited email to B signed with the ring signature
without any preparations, interactions, or secret key exchanges. By using our
proposed ring signature scheme, we can turn standard signature schemes into
designated verifier schemes which can be added at almost no cost as an extra
option to any email system.

2.4 Efficiency of Our Ring Signature Scheme

When based on Rabin or RSA signatures, our ring signature scheme is particu- 
larly efficient: 

— signing requires one modular exponentiation, plus one or two modular mul- 
tiplications for each non-signer. 

— verification requires one or two modular multiplications for each ring mem- 
ber. 

In essence, generating or verifying a ring signature costs the same as generat- 
ing or verifying a regular signature plus an extra multiplication or two for each 
non-signer, and thus the scheme is truly practical even when the ring contains 
hundreds of members. It is two to three orders of magnitude faster than Ca- 
menisch’s scheme, whose claimed efficiency is based on the fact that it is 4 times 
faster than earlier known schemes (see bottom of page 476 in his paper [1]). 
In addition, a Camenisch-like scheme uses linear algebra in the exponents, and 
thus requires all the members to use the same prime modulus p in their indi- 
vidual signature schemes. One of our design criteria is that the signer should be 
able to assemble an arbitrary ring without any coordination with the other ring 
members. In reality, if one wants to use other users’ public keys, they are much 
more likely to be RSA keys, and even if they are based on discrete logs, different 
users are likely to have different moduli p. The only realistic way to arrange a 
Camenisch-like signature scheme is thus to have a group of consenting parties. 

Note that the size of any ring signature must grow linearly with the size of 
the ring, since it must list the ring members; this is an inherent disadvantage of 
ring signatures as compared to group signatures that use predefined groups. 

3 The Proposed Ring Signature Scheme (RSA Version) 

Suppose that Alice wishes to sign a message $m$ with a ring signature for the ring
of $r$ individuals $A_1, A_2, \ldots, A_r$, where the signer Alice is $A_s$, for some value of
$s$, $1 \le s \le r$. To simplify the presentation and proof, we first describe a ring
signature scheme in which all the ring members use RSA [9] as their individual 
signature schemes. The same construction can be used for any other trapdoor 
one way permutation, but we have to modify it slightly in order to use trapdoor 
one way functions (as in, for example, Rabin’s signature scheme [8]). 

3.1 RSA Trap-Door Permutations 

Each ring member $A_i$ has an RSA public key $P_i = (n_i, e_i)$ which specifies the
trapdoor one-way permutation $f_i$ of $\mathbb{Z}_{n_i}$:

$$f_i(x) = x^{e_i} \pmod{n_i}.$$

We assume that only $A_i$ knows how to compute the inverse permutation
$f_i^{-1}$ efficiently, using trap-door information; this is the original Diffie-Hellman
model [4] for public-key cryptography.

Extending trap-door permutations to a common domain

The trap-door RSA permutations of the various ring members will have domains
of different sizes (even if all the moduli $n_i$ have the same number of bits).
This makes it awkward to combine the individual signatures, and thus we extend
all the trap-door permutations to have as their common domain the same set
$\{0,1\}^b$, where $2^b$ is some power of two which is larger than all the moduli $n_i$.

For each trap-door permutation $f_i$ over $\mathbb{Z}_{n_i}$, we define the extended trap-door
permutation $g_i$ over $\{0,1\}^b$ in the following way. For any $b$-bit input $m$ define
nonnegative integers $q_i$ and $r_i$ so that $m = q_i n_i + r_i$ and $0 \le r_i < n_i$. Then

$$g_i(m) = \begin{cases} q_i n_i + f_i(r_i) & \text{if } (q_i + 1) n_i \le 2^b, \\ m & \text{otherwise.} \end{cases}$$

Intuitively, $g_i$ is defined by using $f_i$ to operate on the low-order digit of the $n_i$-ary
representation of $m$, leaving the higher order digits unchanged. The exception is
when this might cause a result larger than $2^b - 1$, in which case $m$ is unchanged.
If we choose a sufficiently large $b$ (e.g. 160 bits larger than any of the $n_i$), the
chance that a randomly chosen $m$ is unchanged by the extended $g_i$ becomes
negligible. (A stronger but more expensive approach, which we don’t need, would
use instead of $g_i(m)$ the function $g'_i(m) = g_i((2^b - 1) - g_i(m))$, which can modify
all its inputs.) The function $g_i$ is clearly a permutation over $\{0,1\}^b$, and it is a
one-way trap-door permutation since only someone who knows how to invert $f_i$
can invert $g_i$ efficiently on more than a negligible fraction of the possible inputs.
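The extension to a common domain can be sketched in a few lines. The following is only an illustration under toy assumptions: the modulus 33 = 3 * 11, the exponents e = 3 and d = 7, the width b = 8, and the helper name `make_extended_permutation` are all chosen here for the example and are not taken from the paper.

```python
def make_extended_permutation(n, e, d, b):
    """Return (g, g_inv) extending x -> x^e mod n to the domain {0,1}^b.

    n, e are the public RSA parameters and d is the secret exponent.
    """
    limit = 1 << b  # 2^b

    def g(m):
        q, r = divmod(m, n)            # m = q*n + r with 0 <= r < n
        if (q + 1) * n <= limit:       # the whole block [q*n, (q+1)*n) fits
            return q * n + pow(r, e, n)
        return m                       # top partial block is left unchanged

    def g_inv(y):
        q, r = divmod(y, n)            # g keeps q, so the same test applies
        if (q + 1) * n <= limit:
            return q * n + pow(r, d, n)
        return y

    return g, g_inv


# Toy example: n = 33, e = 3, d = 7 (3 * 7 = 21 = 1 mod 20), b = 8.
g, g_inv = make_extended_permutation(33, 3, 7, 8)
```

With these toy values the inputs 231..255 fall in the partial top block and are fixed points, which is why the text asks for b to exceed the modulus length by a wide margin.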

3.2 Symmetric Encryption 

We assume the existence of a publicly defined symmetric encryption algorithm
$E$ such that for any key $k$ of length $l$, the function $E_k$ is a permutation over $b$-bit
strings. Here we use the random (permutation) oracle model which assumes that
all the parties have access to an oracle that provides truly random answers to new
queries of the form $E_k(x)$ and $E_k^{-1}(y)$, provided only that they are consistent
with previous answers and with the requirement that $E_k$ be a permutation (e.g.
see [7]).

3.3 Hash Functions 

We assume the existence of a publicly defined collision-resistant hash function $h$
that maps arbitrary inputs to strings of length $l$, which are used as keys for $E$.
We model $h$ as a random oracle. (Since $h$ need not be a permutation, different
queries may have the same answer, and we will disallow “$h^{-1}$” queries.)

3.4 Combining Functions 

We define a family of keyed “combining functions” $C_{k,v}(y_1, y_2, \ldots, y_r)$ which
take as input a key $k$, an initialization value $v$, and arbitrary values
$y_1, y_2, \ldots, y_r$ in $\{0,1\}^b$. Each such combining function uses $E_k$ as a sub-procedure, and
produces as output a value $z$ in $\{0,1\}^b$ such that given any fixed values for $k$
and $v$, we have the following properties.

1. Permutation on each input: For each $s$, $1 \le s \le r$, and for any fixed
values of all the other inputs $y_i$, $i \ne s$, the function $C_{k,v}$ is a one-to-one
mapping from $y_s$ to the output $z$.

2. Efficiently solvable for any single input: For each $s$, $1 \le s \le r$, given a
$b$-bit value $z$ and values for all inputs $y_i$ except $y_s$, it is possible to efficiently
find a $b$-bit value for $y_s$ such that $C_{k,v}(y_1, y_2, \ldots, y_r) = z$.

3. Infeasible to solve verification equation for all inputs without
trap-doors: Given $k$, $v$, and $z$, it is infeasible for an adversary to solve the
equation

$$C_{k,v}(g_1(x_1), g_2(x_2), \ldots, g_r(x_r)) = z \qquad (1)$$

for $x_1, x_2, \ldots, x_r$ (given access to each $g_i$, and to $E_k$) if the adversary can’t
invert any of the trap-door functions $g_1, g_2, \ldots, g_r$.

For example, the function

$$C_{k,v}(y_1, y_2, \ldots, y_r) = y_1 \oplus y_2 \oplus \cdots \oplus y_r$$

(where $\oplus$ is the exclusive-or operation on $b$-bit words) satisfies the first two of the
above conditions, and can be kept in mind as a candidate combining function.
Indeed, it was the first one we tried. But it fails the third condition since for any
choice of trapdoor one-way permutations $g_i$, it is possible to use linear algebra
when $r$ is large enough to find a solution for $x_1, x_2, \ldots, x_r$ without inverting any
of the $g_i$’s. The basic idea of the attack is to choose a random value for each $x_i$,
and to compute each $y_i = g_i(x_i)$ in the easy forward direction. If the number of
values $r$ exceeds the number of bits $b$, we can find with high probability a subset
of the $y_i$ bit strings whose XOR is any desired $b$-bit target $z$. However, our goal
is to represent $z$ as the XOR of all the values $y_1, y_2, \ldots, y_r$ rather than as a XOR
of a random subset of these values. To overcome this problem, we choose for each
$i$ two random values $x'_i$ and $x''_i$, and compute their corresponding $y'_i = g_i(x'_i)$
and $y''_i = g_i(x''_i)$. We then define for each $i$ $y'''_i = y'_i \oplus y''_i$, and modify the target
value to $z' = z \oplus y'_1 \oplus y'_2 \oplus \cdots \oplus y'_r$. We use the previous algorithm to represent
$z'$ as a XOR of a random subset of $y'''_i$ values. After simplification, we get a
representation of the original $z$ as the XOR of a set of $r$ values, with exactly
one value chosen from each pair $(y'_i, y''_i)$. By choosing the corresponding value of
either $x'_i$ or $x''_i$, we can solve the verification equation without inverting any of
the trapdoor one-way permutations $g_i$. (One approach to countering this attack,
which we don’t explore further here, is to let $b$ grow with $r$.)
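The linear-algebra attack on the XOR combining function can be sketched as follows. This is an illustration only: the function `g` below is a hash-based stand-in for the forward direction of $g_i$ (the attack never inverts anything, so any efficiently computable function serves), and the names `solve_subset_xor` and `forge_xor` as well as the sizes b = 16 and r = 40 are assumptions made for the example.

```python
import hashlib
import random


def g(i, x, b):
    """Stand-in for the forward direction of g_i (a toy, not RSA)."""
    digest = hashlib.sha256(f"{i}:{x}".encode()).digest()
    return int.from_bytes(digest, "big") >> (256 - b)   # keep the top b bits


def solve_subset_xor(vectors, target):
    """Gaussian elimination over GF(2): indices whose vectors XOR to target."""
    basis = {}  # leading bit -> (reduced vector, bitmask of indices used)
    for idx, vec in enumerate(vectors):
        mask = 1 << idx
        while vec:
            top = vec.bit_length() - 1
            if top not in basis:
                basis[top] = (vec, mask)
                break
            bvec, bmask = basis[top]
            vec ^= bvec
            mask ^= bmask
    chosen = 0
    while target:
        top = target.bit_length() - 1
        if top not in basis:
            return None                     # target is outside the span
        bvec, bmask = basis[top]
        target ^= bvec
        chosen ^= bmask
    return {i for i in range(len(vectors)) if chosen >> i & 1}


def forge_xor(z, r, b, rng):
    """Find x_1..x_r with g_1(x_1) ^ ... ^ g_r(x_r) == z, no inversion needed."""
    while True:                             # retry in the rare rank-deficient case
        xs1 = [rng.getrandbits(b) for _ in range(r)]    # the x'_i
        xs2 = [rng.getrandbits(b) for _ in range(r)]    # the x''_i
        y1 = [g(i, x, b) for i, x in enumerate(xs1)]
        y2 = [g(i, x, b) for i, x in enumerate(xs2)]
        target = z                          # z' = z ^ y'_1 ^ ... ^ y'_r
        for y in y1:
            target ^= y
        subset = solve_subset_xor([a ^ c for a, c in zip(y1, y2)], target)
        if subset is not None:
            # Take x''_i where the difference y'_i ^ y''_i is used, else x'_i.
            return [xs2[i] if i in subset else xs1[i] for i in range(r)]
```

A call such as `forge_xor(0xBEEF, 40, 16, random.Random(1))` returns forty values whose images XOR to the target, which is exactly the failure of the third condition described above.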

Even worse problems can be shown to exist in other natural combining functions
such as addition mod $2^b$. Assume that we use the RSA trapdoor functions
$g_i(x_i) = x_i^3 \pmod{n_i}$ where all the moduli $n_i$ have the same size $b$. It is
known [5] that any nonnegative integer $z$ can be efficiently represented as the
sum of exactly nine nonnegative integer cubes $x_1^3 + x_2^3 + \cdots + x_9^3$. If $z$ is a $b$-bit
target value, we can expect each one of the $x_i^3$ to be slightly shorter than $z$,
and thus their values are not likely to be affected by reducing each $x_i^3$ modulo
the corresponding $b$-bit $n_i$. Consequently, we can solve the verification equation
$(x_1^3 \bmod n_1) + (x_2^3 \bmod n_2) + \cdots + (x_9^3 \bmod n_9) = z \pmod{2^b}$ with nine RSA
permutations without inverting any one of them.

Our proposed combining function utilizes the symmetric encryption function 
Ek as follows: 

$$C_{k,v}(y_1, y_2, \ldots, y_r) = E_k(y_r \oplus E_k(y_{r-1} \oplus E_k(y_{r-2} \oplus E_k(\ldots \oplus E_k(y_1 \oplus v) \ldots))))$$

This function is applied to the sequence $(y_1, y_2, \ldots, y_r)$, where $y_i = g_i(x_i)$, as
shown in Figure 1; the resulting function is provably secure in the random oracle
model.

[Figure 1: the chain $v \to \oplus \to E_k \to \oplus \to E_k \to \cdots \to \oplus \to E_k \to z$, where the $i$-th XOR also receives $y_i = g_i(x_i)$.]

Fig. 1. An illustration of the proposed combining function



It is clearly a permutation on each input, since the XOR, $g_i$, and $E_k$ functions
are permutations. In addition, it is efficiently solvable for any single input since
knowledge of $k$ makes it possible to run the evaluation forwards from the initial
$v$ and backwards from the final $z$ in order to uniquely compute any missing value
$y_i$. This function can be used to verify signatures by using a hashed version of
$m$ to choose the symmetric key $k$, and forcing the output $z$ to be equal to the
input $v$. This consistency condition $C_{k,v}(y_1, y_2, \ldots, y_r) = v$ bends the line into
the ring shape shown in Fig. 2.

A slightly more compact ring signature variant can be obtained by always 
selecting 0 as the “glue value” v. This variant is also secure, but we prefer the 
total ring symmetry of our main proposal. 

We now formally describe the signature generation and verification proce- 
dures: 

Generating a ring signature: 

Given the message $m$ to be signed, his secret key $S_s$, and the sequence of
public keys $P_1, P_2, \ldots, P_r$ of all the ring members, the signer computes a ring
signature as follows.

1. Choose a key: The signer first computes the symmetric key $k$ as the hash
of the message $m$ to be signed:

$$k = h(m)$$

(a more complicated variant computes $k$ as $h(m, P_1, \ldots, P_r)$; however, the
simpler construction is also secure.)

Fig. 2. Ring signatures



2. Pick a random glue value: Second, the signer picks an initialization (or
“glue”) value $v$ uniformly at random from $\{0,1\}^b$.

3. Pick random $x_i$’s: Third, the signer picks random $x_i$ for all the other ring
members $1 \le i \le r$, $i \ne s$, uniformly and independently from $\{0,1\}^b$, and
computes

$$y_i = g_i(x_i).$$

4. Solve for $y_s$: Fourth, the signer solves the following ring equation for $y_s$:

$$C_{k,v}(y_1, y_2, \ldots, y_r) = v.$$

By assumption, given arbitrary values for the other inputs, there is a unique
value for $y_s$ satisfying the equation, which can be computed efficiently.

5. Invert the signer’s trap-door permutation: Fifth, the signer uses his
knowledge of his trapdoor in order to invert $g_s$ on $y_s$ to obtain $x_s$:

$$x_s = g_s^{-1}(y_s).$$

6. Output the ring signature: The signature on the message $m$ is defined to
be the $(2r + 1)$-tuple:

$$(P_1, P_2, \ldots, P_r; v; x_1, x_2, \ldots, x_r).$$

Verifying a ring signature: 

A verifier can verify an alleged signature

$$(P_1, P_2, \ldots, P_r; v; x_1, x_2, \ldots, x_r)$$

on the message $m$ as follows.

1. Apply the trap-door permutations: First, for $i = 1, 2, \ldots, r$ the verifier
computes

$$y_i = g_i(x_i).$$

2. Obtain $k$: Second, the verifier hashes the message to compute the encryption
key $k$:

$$k = h(m).$$

3. Verify the ring equation: Finally, the verifier checks that the $y_i$’s satisfy
the fundamental equation:

$$C_{k,v}(y_1, y_2, \ldots, y_r) = v. \qquad (2)$$

If the ring equation (2) is satisfied, the verifier accepts the signature as valid. 
Otherwise the verifier rejects. 
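The generation and verification procedures can be put together in a small end-to-end sketch. Everything concrete here is an assumption made for illustration: the 32-bit domain, the toy primes, SHA-256 standing in for $h$, and a four-round Feistel network built from SHA-256 standing in for $E_k$ (the paper only assumes an ideal keyed permutation). It is a toy and not a secure or official implementation.

```python
import hashlib
import random

B = 32  # common domain {0,1}^B; toy size, far too small for real use


def g(pub, x, exp=None):
    """Extended RSA permutation over {0,1}^B; pass exp=d to invert it."""
    n, e = pub
    q, r = divmod(x, n)
    if (q + 1) * n <= 1 << B:
        return q * n + pow(r, e if exp is None else exp, n)
    return x


def _f(k, i, half):
    """Feistel round function returning 16 bits (half of B = 32)."""
    data = k + bytes([i]) + half.to_bytes(4, "big")
    return int.from_bytes(hashlib.sha256(data).digest()[:2], "big")


def E(k, x, inverse=False):
    """Stand-in for E_k: a 4-round Feistel permutation on B-bit values."""
    hb = B // 2
    left, right = x >> hb, x & ((1 << hb) - 1)
    if not inverse:
        for i in range(4):
            left, right = right, left ^ _f(k, i, right)
    else:
        for i in reversed(range(4)):
            left, right = right ^ _f(k, i, left), left
    return (left << hb) | right


def sign(msg, pubs, s, d_s, rng):
    k = hashlib.sha256(msg).digest()           # step 1: k = h(m)
    v = rng.getrandbits(B)                     # step 2: glue value
    xs = [rng.getrandbits(B) for _ in pubs]    # step 3 (x_s is replaced below)
    ys = [g(pub, x) for pub, x in zip(pubs, xs)]
    fwd = v                                    # run forwards from v to the gap
    for i in range(s):
        fwd = E(k, ys[i] ^ fwd)
    bwd = v                                    # run backwards from z = v to the gap
    for i in range(len(pubs) - 1, s, -1):
        bwd = E(k, bwd, inverse=True) ^ ys[i]
    y_s = E(k, bwd, inverse=True) ^ fwd        # step 4: unique y_s closing the ring
    xs[s] = g(pubs[s], y_s, exp=d_s)           # step 5: x_s = g_s^{-1}(y_s)
    return v, xs                               # step 6 (public keys sent alongside)


def verify(msg, pubs, v, xs):
    k = hashlib.sha256(msg).digest()
    t = v
    for pub, x in zip(pubs, xs):
        t = E(k, g(pub, x) ^ t)
    return t == v                              # ring equation (2)


PRIMES = [(1013, 1019), (1031, 1049), (1061, 1091)]   # toy primes, all 2 mod 3
PUBS = [(p * q, 3) for p, q in PRIMES]
SECRETS = [pow(3, -1, (p - 1) * (q - 1)) for p, q in PRIMES]  # needs Python 3.8+
```

For example, `sign(b"leak", PUBS, 1, SECRETS[1], random.Random(0))` yields a pair `(v, xs)` that `verify` accepts for the same message, and the signer needs only their own secret exponent plus the other members' public keys.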



3.5 Security 

The identity of the signer is unconditionally protected with our ring signature 
scheme. To see this, note that for each $k$ and $v$ the ring equation has exactly
$(2^b)^{(r-1)}$ solutions, and all of them can be chosen by the signature generation
procedure with equal probability, regardless of the signer’s identity. This
argument does not depend on any complexity-theoretic assumptions or on the
randomness of the oracle. 
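The count of solutions can be spelled out as a short derivation. It uses only the two combining-function properties and the fact that each $g_i$ is a permutation; the layout below is one way to write it, not the authors' own presentation.

```latex
% For fixed k and v: r-1 inputs are free, and the remaining one is forced
% by the "permutation on each input" property of C_{k,v}.
\begin{align*}
\#\{(y_1,\ldots,y_r) : C_{k,v}(y_1,\ldots,y_r) = v\}
  &= \underbrace{2^b \cdot 2^b \cdots 2^b}_{r-1 \text{ free inputs } y_i,\ i \ne s}
     \cdot \underbrace{1}_{y_s \text{ forced}}
   = (2^b)^{r-1}, \\
\Pr[\text{signer } s \text{ outputs } (x_1,\ldots,x_r) \mid k, v]
  &= \prod_{i \ne s} 2^{-b} = (2^b)^{-(r-1)} \quad \text{for every } s.
\end{align*}
% Since each g_i is a permutation, solutions in the y_i correspond one-to-one
% to solutions in the x_i, so the distribution does not depend on s.
```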

The soundness of the ring signature scheme must be computational, since ring 
signatures cannot be stronger than the individual signature scheme used by the 
possible signers. Our goal now is to show that in the random oracle model, any 
forging algorithm A which can generate with non-negligible probability a new 
ring signature for $m$ by analysing polynomially many ring signatures for other
chosen messages $m_j \ne m$, can be turned into an algorithm B which inverts one
of the trapdoor one-way functions $g_i$ on random inputs $y$ with non-negligible
probability. 

Algorithm A accepts the public keys $P_1, P_2, \ldots, P_r$ (but not any of the
corresponding secret keys) and is given oracle access to $h$, $E$, $E^{-1}$, and to a
ring signing oracle. It can work adaptively, querying the oracles at arguments 
that may depend on previous answers. Eventually, it must produce a valid ring 
signature on a new message that was not presented to the signing oracle, with a 
non-negligible probability (over the random answers of the oracles and its own 
random tape). 

Algorithm B uses algorithm A as a black box, but has full control over its 
oracles. A must query the oracle about all the symmetric encryptions along 
the forged ring signature of m (otherwise the probability of satisfying the ring 
equation becomes negligible). Without loss of generality, we can assume that each
one of these $r$ symmetric encryptions is queried once either in the “clockwise” $E_k$
direction or in the “counterclockwise” $E_k^{-1}$ direction, but not in both directions
since this is redundant. When A makes its polynomially many queries of $E_k$ and
$E_k^{-1}$ with various keys $k = h(m)$, B can guess which $k$ will be involved in the
actual forgery with non-negligible probability, but it cannot guess which subset
of $r$ queries will be used in the final forgery and in which order they will occur
along the satisfied ring equation since there are too many possibilities.

Algorithm B can easily simulate the ring signing oracle for all the other
$m_j$ by providing random vectors $(v, x_1, x_2, \ldots, x_r)$ as their ring signatures, and
adjusting the random answers for queries of the form $E_{k_j}$ and $E_{k_j}^{-1}$ to
support the correctness of the ring equation for these messages. Note that A
cannot ask relevant oracle questions which will limit B's freedom of choice before
providing $m_j$ to the signing oracle since all the values along the actual ring
signature (including $v$) are chosen randomly by B when it provides the requested
signature, and cannot be guessed in advance by A. In addition, we use the
assumption that $h$ is collision resistant to show that $E$ and $E^{-1}$ queries with
key $k_j = h(m_j)$ will not constrain the answers to $E$ and $E^{-1}$ queries with key
$k = h(m)$ which will be used in the final forgery, since they use different keys.

The goal of algorithm B is to compute, for some $i$, $x_i = g_i^{-1}(y)$ for random
inputs $y$ with non-negligible probability. This will reduce the security of the
ring signature to the security of the individual signature schemes. The basic idea 
of the reduction is to slip this random y as the “gap” between the output and 
input values of two cyclically consecutive E's along the ring equation of the final 
forgery, which forces A to close the gap by providing the corresponding Xi in 
the generated signature. Note that y is a random value which is known to B 
but not to A, and thus A cannot “recognize the trap” and refuse to sign the 
corresponding messages. 

The main difficulty is that A can close gaps between E values not only by
inverting trapdoor one-way functions, but also by evaluating these functions in
the easy forward direction (as done by the real signer in the generation of ring
signatures). To overcome this difficulty, we note that in any valid ring signature
produced by A, there must be a gap somewhere between two cyclically consecutive
occurrences of E in which the queries were computed in one of the following
three ways: 

— The oracle for the i-th E was queried in the “clockwise” direction and the 
oracle for the i + 1-st E was queried in the “counterclockwise” direction. 

— Both E's were queried in the “clockwise” direction, but the i-th E was 
queried after the i + 1-st E. 

— Both E's were queried in the “counterclockwise” direction, but the i-th E 
was queried before the i + 1-st E. 

In all these cases, B can provide a random answer to the later query which
is based on his knowledge of input and output of the earlier query in such a way
that the XOR of the values across the gap is the desired $y$. This will force A to
compute the corresponding $g_i^{-1}(y)$ in order to fill in this gap in its final ring
signature.

B does not know which queries will be these cyclically consecutive queries in
the forged ring signature, and thus he has to guess their identity. However, he has
to make only two guesses and thus the probability of guessing correctly is $1/Q^2$
where $Q$ is the total number of queries made by the forger A. Consequently, B
will manage to compute $g_i^{-1}(y)$ for a random $y$ and some $i$ with non-negligible
probability.

When the trapdoor one-way functions $g_i$ are RSA functions, we can slightly
strengthen the result. Since RSA is homomorphic, we can randomize $y$ by computing
$y' = y \cdot t^{e_i} \pmod{n_i}$ for a randomly chosen $t$. By using $y'$ instead of $y$,
we can show that successful forgeries of ring signatures can be used to extract
modular roots from particular numbers such as $y = 2$, and not just from random
inputs $y$. This is not necessarily true for other trapdoor functions, since the
forger A can intentionally decide not to produce any forgeries in which one of
the gaps between cyclically consecutive E functions happens to be 2.

4 Our Ring Signature Scheme (Rabin Version) 

Rabin’s public-key cryptosystem [8] has more efficient signature verification than 
RSA, since verification involves squaring rather than cubing, which reduces the 
number of modular multiplications from 2 to 1. However, we need to deal with 
the fact that the Rabin mapping $f_i(x_i) = x_i^2 \pmod{n_i}$ is not a permutation over
$\mathbb{Z}^*_{n_i}$, and thus only one quarter of the messages can be signed, and those which
can be signed have multiple signatures.

The operational fix is the natural one: when signing, change your last random
choice of $x_{s-1}$ if $g_s^{-1}(y_s)$ is undefined. Since only one trapdoor one-way function
has to be inverted, the signer should expect on average to try four times before 
succeeding in producing a ring signature. The complexity of this search is essen- 
tially the same as in the case of regular Rabin signatures, regardless of the size 
of the ring. 

A more important difference is in the proof of unconditional anonymity, which 
relied on the fact that all the mappings were permutations. When the gi are 
not permutations, there can be noticeable differences between the distribution of
randomly chosen and computed Xi values in given ring signatures. This could 
lead to the identification of the real signer among all the possible signers, and 
can be demonstrated to be a real problem in many concrete types of trapdoor 
one-way functions. 

We overcome this difficulty in the case of Rabin signatures with the following 
simple observation: 

Theorem 1. Let $S$ be a given finite set of “marbles” and let $B_1, B_2, \ldots, B_n$
be disjoint subsets of $S$ (called “buckets”) such that all non-empty buckets have
the same number of marbles, and every marble in S is in exactly one bucket. 
Consider the following sampling procedure: pick a bucket at random until you 
find a non-empty bucket, and then pick a marble at random from that bucket. 
Then this procedure picks marbles from S with uniform probability distribution. 

Proof. Trivial. □ 

Rabin’s functions $f_i(x_i) = x_i^2 \pmod{n_i}$ are extended to functions $g_i(x_i)$ over
$\{0,1\}^b$ in the usual way. Both the marbles and the buckets are all the $b$-bit
numbers $u = q_i n_i + r_i$ in which $r_i \in \mathbb{Z}^*_{n_i}$ and $(q_i + 1) n_i \le 2^b$. Each marble is
placed in the bucket to which it is mapped by the extended Rabin mapping $g_i$. We
know that each bucket contains either zero or four marbles, and Theorem 1 implies
that the sampled distribution of the marbles $x_i$ is exactly the same regardless of
whether they were chosen at random or picked at random among the computed
inverses in a randomly chosen bucket. Consequently, even an infinitely powerful
adversary cannot distinguish between signers and nonsigners by analysing actual
ring signatures produced by one of the possible signers.
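The marbles-and-buckets argument can be checked exactly on a toy modulus. The sketch below works directly in $\mathbb{Z}^*_n$ and leaves out the extension to $b$-bit values, which only repeats the same picture in each block; the modulus 77 = 7 * 11 and the function name `marble_distribution` are assumptions made for the example.

```python
from fractions import Fraction
from math import gcd


def marble_distribution(n):
    """Exact output distribution of: pick a random non-empty bucket,
    then pick a random marble (square root) from it, for x -> x^2 mod n."""
    units = [x for x in range(1, n) if gcd(x, n) == 1]
    buckets = {y: [] for y in units}          # bucket y holds the roots of y
    for x in units:
        buckets[x * x % n].append(x)
    nonempty = [ms for ms in buckets.values() if ms]
    prob = {}
    for marbles in nonempty:
        for x in marbles:
            # Pr[this bucket | non-empty] * Pr[this marble | bucket]
            prob[x] = Fraction(1, len(nonempty)) * Fraction(1, len(marbles))
    return units, buckets, prob


units, buckets, prob = marble_distribution(77)
```

For n = 77 every bucket holds either zero or four marbles, and every marble is drawn with probability exactly 1/60, the same as a uniform choice from the 60 units.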

5 Generalizations and Special Cases 

The notion of ring signatures has many interesting extensions and special cases. 
In particular, ring signatures with $r = 1$ can be viewed as a randomized version
of Rabin’s signature scheme: As shown in Fig. 3, the verification condition can
be written as $(x^2 \bmod n) = v \oplus E_{h(m)}^{-1}(v)$. The right hand side is essentially a
hash of the message $m$, randomized by the choice of $v$.

Ring signatures with $r = 2$ have the ring equation:

$$E_{h(m)}(x_2^2 \oplus E_{h(m)}(x_1^2 \oplus v)) = v$$

(see Fig. 3). A simpler ring equation (which is not equivalent but has the same
security properties) is:

$$(x_1^2 \bmod n_1) = E_{h(m)}(x_2^2 \bmod n_2)$$

where the modular squares are extended to $\{0,1\}^b$ in the usual way. This is our
recommended method for implementing designated verifier signatures in email
systems, where $n_1$ is the public key of the sender and $n_2$ is the public key of the
recipient.



Fig. 3. Rabin-based Ring Signatures with r = 1, 2



In regular ring signatures it is provably impossible for an adversary to expose 
the signer’s identity. However, there may be cases in which the signer himself
wants to have the option of later proving his authorship of the anonymized
email (e.g., if he is successful in toppling the disgraced Prime Minister). Yet 
another possibility is that the signer A wants to initially use {A,B,C} as the 
list of possible signers, but later prove that C is not the real signer. There is 
a simple way to implement these options, by choosing the Xi values for the 
nonsigners in a pseudorandom rather than truly random way. To show that C is 
not the author, A publishes the seed which pseudorandomly generated the part 
of the signature associated with C. To prove that A is the signer, A can reveal a 
single seed which was used to generate all the nonsigners’ parts of the signature. 
The signer A cannot misuse this technique to prove that he is not the signer 
since his part is computed rather than generated, and is extremely unlikely to 
have a corresponding seed. Note that these modified versions can guarantee only 
computational anonymity, since a powerful adversary can search for such proofs 
of nonauthorship and use them to expose the signer. 
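One way to organize the seeds is sketched below. The paper does not fix a generator or a seed layout, so everything here is an assumption: SHA-256 in counter mode serves as the pseudorandom generator, and per-member seeds are derived from a single master seed so that revealing one member seed covers that member only while revealing the master seed covers all nonsigners. The function names are hypothetical.

```python
import hashlib


def expand(seed: bytes, nbits: int) -> int:
    """Stretch a seed into an nbits-bit integer (SHA-256 in counter mode)."""
    out = b""
    counter = 0
    while len(out) * 8 < nbits:
        out += hashlib.sha256(seed + counter.to_bytes(4, "big")).digest()
        counter += 1
    return int.from_bytes(out, "big") >> (len(out) * 8 - nbits)


def member_seed(master: bytes, i: int) -> bytes:
    """Hypothetical derivation of the seed for ring member i."""
    return hashlib.sha256(master + b"|member|" + i.to_bytes(4, "big")).digest()


def nonsigner_x(master: bytes, i: int, nbits: int) -> int:
    """The x_i the signer would place in the signature for nonsigner i."""
    return expand(member_seed(master, i), nbits)


def check_member_seed(seed_i: bytes, x_i: int, nbits: int) -> bool:
    """Anyone can check a revealed member seed against the published x_i."""
    return expand(seed_i, nbits) == x_i
```

The signer's own $x_s$ comes from inverting $g_s$, so a seed that expands to it is not expected to exist, which is the asymmetry the paragraph above relies on.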

Abstract. We consider a novel security requirement of encryption schemes
that we call “key-privacy” or “anonymity”. It asks that an eavesdropper
in possession of a ciphertext not be able to tell which specific
key, out of a set of known public keys, is the one under which the ciphertext
was created, meaning the receiver is anonymous from the point of
view of the adversary. We investigate the anonymity of known encryption
schemes. We prove that the El Gamal scheme provides anonymity
under chosen-plaintext attack assuming the Decision Diffie-Hellman
problem is hard and that the Cramer-Shoup scheme provides anonymity
under chosen-ciphertext attack under the same assumption. We also
consider anonymity for trapdoor permutations. Known attacks indicate 
that the RSA trapdoor permutation is not anonymous and neither are 
the standard encryption schemes based on it. We provide a variant of 
RSA-OAEP that provides anonymity in the random oracle model assu- 
ming RSA is one-way. We also give constructions of anonymous trapdoor 
permutations, assuming RSA is one-way, which yield anonymous encryp- 
tion schemes in the standard model. 



1 Introduction 

The classical security requirement of an encryption scheme is that it provide pri- 
vacy of the encrypted data. Popular formalizations — such as indistinguishability 
(semantic security) [22] or non-malleability [15], under either chosen-plaintext 
or various kinds of chosen-ciphertext attacks [27,29] — are directed at capturing 
various data-privacy requirements. (See [5] for a comprehensive treatment). 

In this paper we consider a different (additional) security requirement of 
an encryption scheme which we call key-privacy or anonymity. It asks that the
encryption provide (in addition to privacy of the data being encrypted) privacy
of the key under which the encryption was performed. 

This might sound odd, especially in the public-key setting which is our main 
focus: here the key under which encryption is performed is the public key of 
the receiver and being public there might not seem to be anything to keep 
private about it. The privacy refers to the information conveyed to the adversary 
regarding which specific key, out of a set of known public keys, is the one under 
which a given ciphertext was created. We call this anonymity because it means 
that the receiver is anonymous from the point of view of the adversary. 

Anonymity of encryption has surfaced in various different places in the past, 
and found several applications, as we detail later. However, it lacks a compre- 
hensive treatment. Our goal is to provide definitions, and then systematically 
study popular asymmetric encryption schemes with regard to their meeting these 
definitions. Below we discuss our contributions and then discuss related work. 

1.1 Definitions 

We suggest a notion we call “indistinguishability of keys” to formalize the property
of key-privacy. In the formalization, the adversary knows two public keys
$pk_0, pk_1$, corresponding to two different entities, and gets a ciphertext $C$ formed
by encrypting some data under one of these keys. Possession of C should not 
give the adversary an advantage in determining under which of the two keys 
C was created. This can be considered under either chosen-plaintext attack or 
chosen-ciphertext attack, yielding two notions of security, IK-CPA and IK-CCA. 

We also introduce the notion of an anonymous trapdoor permutation, which 
will serve as a tool in some of the designs.

1.2 The Search for Anonymous Asymmetric Encryption Schemes 

In a heterogeneous public-key environment, encryption will probably fail to be
anonymous for trivial reasons. For example, different users might be using different
cryptosystems, or, if the same cryptosystem, have keys of different lengths.
(If one possible recipient has an RSA public key with a 1024 bit modulus and the
other an RSA public key with a 512 bit modulus, the length of the RSA ciphertext
will immediately enable an eavesdropper to know for which recipient the
ciphertext is intended.) We can however hope for anonymity in a context where 
all users use the same security parameter or global parameters. We will look at 
specific systems with this restriction in mind. 

Ideally, we would like to be able to prove that popular, existing and practical 
encryption schemes have the anonymity property (rather than having to design 
new schemes.) This would be convenient because then existing encryption-using 
protocols or software would not have to be altered in order for them to have the 
anonymity guarantees conferred by those of the encryption scheme. Accordingly, 
we begin by examining existing schemes. We will consider discrete log based 
schemes such as El Gamal and Cramer-Shoup, and also RSA-based schemes 
such as RSA-OAEP.

It is easy to see that an encryption scheme could meet even the strongest
notion of data-privacy — namely indistinguishability under chosen-ciphertext 
attack — yet not provide key-privacy. (The ciphertext could contain the public 
key.) Accordingly, existing results about data-privacy of asymmetric encryption 
schemes are not directly applicable. Existing schemes must be re-analyzed with 
regard to key-privacy. 

In approaching this problem, we had no a priori way to predict whether or 
not a given asymmetric scheme would have the key-privacy property, and, if it 
did, whether the proof would be a simple modification of the known data privacy 
proof, or require new techniques. It is only by doing the work that one can tell 
what is involved. 

We found that the above-mentioned discrete log based schemes did have the 
key-privacy property, and, moreover, that it was possible to prove this, under the 
same assumptions as used to prove data-privacy, by following the outline of the 
proofs of data-privacy with appropriate modifications. This perhaps unexpected 
strength of the discrete log based world (meaning not only the presence of the 
added security property in the popular schemes, but the fact that the existing 
techniques are strong enough to lead to a proof) seems important to highlight. In 
contrast, folklore attacks already rule out key-privacy for standard RSA-based 
schemes. Accordingly, we provide variants that have the property. Let us now 
look at these results in more detail. 



1.3 Discrete Log Based Schemes 

The El Gamal cryptosystem over a group of prime order provably provides 
data-privacy under chosen-plaintext attack assuming the DDH (Decision Diffie- 
Hellman) problem is hard in the group [25,12,33,3]. Let us now consider a system 
of users all of which work over the same group. (To be concrete, let $q$ be a prime
such that $2q + 1$ is also prime, let $G_q$ be the order $q$ subgroup of quadratic residues
of $\mathbb{Z}^*_{2q+1}$, and let $g \in G_q$ be a generator of $G_q$. Then $q, g$ are system wide
parameters based on which all users choose keys.) In this setting we prove that
the El Gamal scheme meets the notion of IK-CPA under the same assumption
used to establish data-privacy, namely the hardness of the DDH problem in the 
group. Thus the El Gamal scheme provably provides anonymity. Our proof ex- 
ploits self-reducibility properties of the DDH problem together with ideas from 
the proof of data-privacy. 

The Cramer-Shoup scheme [12] is proven to provide data-privacy under
chosen-ciphertext attack, under the assumption that the DDH problem is hard
in the group underlying the scheme. Let us again consider a system of users,
all of which work over the same group, and for concreteness let it be the group
$G_q$ that we considered above. In this setting we prove that the Cramer-Shoup
scheme meets the notion of IK-CCA assuming the DDH problem is hard in $G_q$.
Our proof exploits ideas in [12,3].




Key-Privacy in Public-Key Encryption 569 



1.4 RSA-Based Schemes 

A simple observation that seems to be folklore is that standard RSA encryption
does not provide anonymity, even when all moduli in the system have the same
length. In all popular schemes, the ciphertext is (or contains) an element
$y = x^e \bmod N$ where $x$ is a random member of $\mathbb{Z}_N^*$. Suppose an
adversary knows that the ciphertext is created under one of two keys $N_0, e_0$
or $N_1, e_1$, and suppose $N_0 < N_1$. If $y > N_0$ then the adversary bets it
was created under $N_1, e_1$, else it bets it was created under $N_0, e_0$. It is
not hard to see that this attack has non-negligible advantage.
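
As a rough illustration of the attack just described (not part of the original
analysis), the following Python sketch estimates the adversary's advantage
empirically. The two 20-bit moduli, the exponent 7 and the helper names are
hypothetical toy choices made only so that the example runs quickly; real moduli
are far larger.

```python
import random
from math import gcd

# Toy parameters (assumptions for illustration only): two moduli of equal bit length.
N0, E0 = 727 * 733, 7      # N0 = 532891  (20 bits)
N1, E1 = 1019 * 1021, 7    # N1 = 1040399 (20 bits)

def sample_unit(n, rng):
    """Return a uniformly random element of Z_n^*."""
    while True:
        x = rng.randrange(1, n)
        if gcd(x, n) == 1:
            return x

def adversary(y, n0):
    """The folklore attack: bet on key 1 exactly when y > N0."""
    return 1 if y > n0 else 0

def estimate_advantage(trials=5000, seed=2001):
    """Estimate Pr[A outputs 1 | key 1] - Pr[A outputs 1 | key 0] by sampling."""
    rng = random.Random(seed)
    ones = []
    for n, e in ((N0, E0), (N1, E1)):
        count = 0
        for _ in range(trials):
            y = pow(sample_unit(n, rng), e, n)   # textbook RSA: y = x^e mod N
            count += adversary(y, N0)
        ones.append(count / trials)
    return ones[1] - ones[0]
```

Under key 0 the adversary never outputs 1, since every $y$ is below $N_0$; under
key 1 it outputs 1 whenever $y$ lands above $N_0$, which for these toy moduli
happens a little under half of the time.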

One approach to anonymizing RSA, suggested by Desmedt [14], is to add 
random multiples of the modulus N to the ciphertext. This seems to overcome 
the above attack, at least when the data encrypted is random, but results in a 
doubling of the length of the ciphertext. We look at a few other approaches. 

We consider an RSA-based encryption scheme popular in current practice,
namely RSA-OAEP [8]. (It is the PKCS #1 v2.0 standard [28], proved secure against
chosen-ciphertext attack in the random oracle model [18].) We suggest a variant
which we can prove is anonymous. Recall that OAEP is a randomized (invertible)
transform that on input a message $M$ picks a random string $r$ and, using some
public hash functions, produces a point $x = \mathrm{OAEP}(r, M) \in \mathbb{Z}_N^*$
where $N, e$ is the public key of the receiver. The ciphertext is then
$y = x^e \bmod N$. Our variant simply repeats the ciphertext computation, each
time using new coins, until the ciphertext $y$ satisfies $1 \le y \le 2^{k-2}$,
where $k$ is the length of $N$. We prove that this scheme meets the notion of
IK-CCA in the random oracle model assuming RSA is a one-way function.
(Data-privacy under chosen-ciphertext attack must be re-proved, but this can be
done, under the same assumption, following [18].) Since the expected number of
exponentiations for encryption is two, encryption in our variant is about twice
as expensive as for RSA-OAEP itself, but this may be tolerable when the
encryption exponent is small. The cost of decryption is the same as for RSA-OAEP
itself, namely one exponentiation with the decryption exponent. As compared to
Desmedt's scheme, the size of the ciphertext increases by only one bit rather
than doubling. Our proof exploits the framework and techniques of [18,8].

1.5 Trapdoor Permutation Based Schemes 

We then ask a more theoretical, or foundational, question, namely whether there 
exists an encryption scheme that can be proven to provide key-privacy based 
only on the assumption that RSA is one-way, meaning without making use of 
the random oracle model. To answer this we return to the classical techniques 
based on hardcore bits. We define a notion of anonymity for trapdoor permu- 
tations. We note that the above attack implies that RSA is not an anonymous 
trapdoor permutation, but we then design some trapdoor permutations which 
are anonymous and one-way as long as RSA is one-way. Appealing to known 
results about hardcore bits then yields an encryption scheme whose anonymity 
is proven based solely on the one-wayness of RSA. The computational costs of 
this approach, however, prohibit its being useful in practice. 

1.6 Applications and Related Work 

In recent years, anonymous encryption has arisen in the context of mobile com- 
munications. Consider a mobile user A, communicating over a wireless network 
with some entity B. The latter is sending A ciphertexts encrypted under A’s 
public key. A common case is that B is a base station. A wants to keep her iden- 
tity private from an eavesdropping adversary. In this case A will be a member 
of some set of users whose identities and public keys are possibly known to the 
adversary. The adversary will also be able to see the ciphertexts sent by B to A. 
If the scheme is anonymous, however, the adversary will be unable to determine 
A’s identity. A particular case of this is anonymous authenticated key exchange, 
where the communication between roaming user A and base station B is for the 
purpose of authentication and distribution of a session key based on the parties'
public keys, but the identity of A should remain unknown to an eavesdropper. 
Anonymity is targeted in authenticated key exchange protocols such as SKEME 
[23]. The author notes that a requirement for SKEME to provide anonymous 
authenticated key exchange is that the public-key encryption scheme used to 
encrypt under A’s public key must have the key-privacy property. 

In independent and concurrent work, Camenisch and Lysyanskaya [10] consider
anonymous credential systems. Such a system enables users to control the 
dissemination of information about themselves. It is required that it be infeasi- 
ble to correlate transactions carried out by the same user. The solution to this 
given in [10] makes use of a verifiable circular encryption scheme that needs to 
have the key-privacy property. They provide a notion similar to ours, but in the 
context of verifiable encryption. They observe that their variant of the El Gamal 
scheme is anonymous under chosen-plaintext attack. 

Sako [30] considers the problem of achieving bid secrecy and verifiability 
in auction protocols. Their approach is to express each bid as an encryption 
of a known message, with the key to encrypt it corresponding to the value of 
the bid. Thus, what needs to be hidden is not the message that is encrypted, 
but the key used to encrypt it. The bid itself can be identified by finding the 
corresponding decrypting key that successfully decrypts to the given message. 
Unlike the previous examples, where the key-privacy property was needed to 
protect identities, this application shows how that property can be exploited to 
satisfy a secrecy requirement. Sako also considered a notion similar to ours and 
gave a variant of the El Gamal scheme that was expected to be secure in that 
sense. 

Formal notions of key-privacy have appeared in the context of symmetric 
encryption [1,13,17]. Abadi and Rogaway [1] show that popular modes of operation
of block ciphers, such as CBC, provide key-privacy if the block cipher is a
pseudorandom permutation. 

The notion given by Desai [13], like ours, is concerned with the privacy of
keys. However, the goal, model and setting in which it is considered differ from
ours: the goal there is to capture a security property for block cipher based
encryption schemes that implies that exhaustive key-search on them is slowed
down in proportion to the size of the ciphertext. There is, however, a similarity
between our definitions (suitably adapted to the symmetric setting) and those
of Abadi and Rogaway [1] and Fischlin [17]. Although the exact formalizations
differ, it is not hard to see that there is an equivalence between the three for
chosen-plaintext attack. 

Chosen-ciphertext attacks do not seem to have been considered before in 
the context of key-privacy. In fact, Fischlin [17] observes that giving decryption 
oracles to the adversary in their setting makes its task trivial. However, in our 
formalization chosen-ciphertext attacks can be modeled by giving decryption 
oracles and then putting an appropriate restriction on their use. The restriction
is the most natural one and is in any case already in effect for modeling semantic
security against chosen-ciphertext attack. This allows us to make a distinction
between those encryption schemes that are anonymous under chosen-ciphertext
attack, such as Cramer-Shoup, and those that are not, such as El Gamal, just as
there are schemes that are semantically secure under chosen-plaintext attack but
not under chosen-ciphertext attack. 



2 Notions of Key-Privacy 

The notions of security typically considered for encryption schemes are
“indistinguishability of encryptions under chosen-plaintext attack” [22] and
“indistinguishability of encryptions under adaptive chosen-ciphertext attack” [29].
The former is usually denoted IND-CPA, but is denoted IE-CPA in this paper to
emphasize that it is about encryptions, not keys. Similarly, the latter notion is
usually denoted IND-CCA (or IND-CCA2), but is denoted IE-CCA in this paper.
It is well-known that these capture strong data-privacy properties. However,
they do not guarantee that no partial information about the underlying key
is leaked. Indeed, in a public-key encryption scheme, the entire public key
could be made an explicit part of the ciphertext and yet the scheme could meet
the above-mentioned data-privacy notions. We want to make a distinction between
such schemes and those that do not leak information about the underlying
key. As noted earlier, schemes of the latter kind are necessary if the anonymity
of receivers is a concern. 

We are interested in formalizing the inability of an adversary, given a chal- 
lenge ciphertext, to learn any information about the underlying plaintext or 
key. It is not hard to see that the goals of data-privacy and key-privacy are 
orthogonal. We recognize that existing encryption schemes are likely to have 
already been investigated with respect to their data-privacy security properties. 
Hence it is useful, from a practical point of view, to isolate the key-privacy re- 
quirements from the data-privacy ones. We do this in the form of two notions: 
“indistinguishability of keys under chosen-plaintext attack” (IK-CPA) and “in- 
distinguishability of keys under adaptive chosen-ciphertext attack” (IK-CCA). 
We begin with a syntax for public-key encryption schemes, divorcing syntax from 
formal notions of security. 

2.1 Syntax 

The syntax of an encryption scheme specifies what algorithms make it up. We 
augment the usual formalization in order to better model practice, where users 
may share some fixed “global” information. 

A public-key encryption scheme $\mathcal{PE} = (\mathcal{G}, \mathcal{K}, \mathcal{E}, \mathcal{D})$ consists of four algorithms.
The common-key generation algorithm $\mathcal{G}$ takes as input some security parameter
$k$ and returns some common key $I$. (Here $I$ may be just a security parameter $k$,
or include some additional information. For example in a Diffie-Hellman based
scheme, $I$ might include, in addition to $k$, a global prime number and generator of
a group which all parties use to create their keys.) The key generation algorithm
$\mathcal{K}$ is a randomized algorithm that takes as input the common key $I$ and returns
a pair $(pk, sk)$ of keys, the public key and a matching secret key, respectively; we
write $(pk, sk) \stackrel{R}{\leftarrow} \mathcal{K}(I)$. The encryption algorithm $\mathcal{E}$ is a randomized algorithm
that takes the public key $pk$ and a plaintext $x$ to return a ciphertext $y$; we write
$y \stackrel{R}{\leftarrow} \mathcal{E}_{pk}(x)$. The decryption algorithm $\mathcal{D}$ is a deterministic algorithm that takes
the secret key $sk$ and a ciphertext $y$ to return the corresponding plaintext $x$ or a
special symbol $\bot$ to indicate that the ciphertext was invalid; we write $x \leftarrow \mathcal{D}_{sk}(y)$
when $y$ is valid and $\bot \leftarrow \mathcal{D}_{sk}(y)$ otherwise. Associated to each public key $pk$ is
a message space $\mathrm{MsgSp}(pk)$ from which $x$ is allowed to be drawn. We require
that $\mathcal{D}_{sk}(\mathcal{E}_{pk}(x)) = x$ for all $x \in \mathrm{MsgSp}(pk)$.



2.2 Indistinguishability of Keys 



We give notions of key-privacy under chosen-plaintext and chosen-ciphertext
attacks. We think of an adversary running in two stages. In the find stage it
takes two public keys $pk_0$ and $pk_1$ (corresponding to secret keys $sk_0$ and $sk_1$,
respectively) and outputs a message $x$ together with some state information $s$. In
the guess stage it gets a challenge ciphertext $y$ formed by encrypting the message
$x$ under one of the two keys, chosen at random, and must say which key was chosen.
In the case of a chosen-ciphertext attack the adversary gets oracles for
$\mathcal{D}_{sk_0}(\cdot)$ and $\mathcal{D}_{sk_1}(\cdot)$ and is allowed to invoke them on any point with the
restriction (on both oracles) of not querying $y$ during the guess stage.



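Definition 1. [IK-CPA, IK-CCA] Let $\mathcal{PE} = (\mathcal{G}, \mathcal{K}, \mathcal{E}, \mathcal{D})$ be an encryption
scheme. Let $b \in \{0, 1\}$ and $k \in \mathbb{N}$. Let $A_{\text{cpa}}$, $A_{\text{cca}}$ be adversaries that run in two
stages and where $A_{\text{cca}}$ has access to the oracles $\mathcal{D}_{sk_0}(\cdot)$ and $\mathcal{D}_{sk_1}(\cdot)$. Now, we
consider the following experiments:

Experiment $\mathbf{Exp}^{\text{ik-cpa-}b}_{\mathcal{PE},A_{\text{cpa}}}(k)$
    $I \stackrel{R}{\leftarrow} \mathcal{G}(k)$
    $(pk_0, sk_0) \stackrel{R}{\leftarrow} \mathcal{K}(I)$ ; $(pk_1, sk_1) \stackrel{R}{\leftarrow} \mathcal{K}(I)$
    $(x, s) \leftarrow A_{\text{cpa}}(\text{find}, pk_0, pk_1)$
    $y \leftarrow \mathcal{E}_{pk_b}(x)$
    $d \leftarrow A_{\text{cpa}}(\text{guess}, y, s)$
    Return $d$

Experiment $\mathbf{Exp}^{\text{ik-cca-}b}_{\mathcal{PE},A_{\text{cca}}}(k)$
    $I \stackrel{R}{\leftarrow} \mathcal{G}(k)$
    $(pk_0, sk_0) \stackrel{R}{\leftarrow} \mathcal{K}(I)$ ; $(pk_1, sk_1) \stackrel{R}{\leftarrow} \mathcal{K}(I)$
    $(x, s) \leftarrow A_{\text{cca}}^{\mathcal{D}_{sk_0}(\cdot), \mathcal{D}_{sk_1}(\cdot)}(\text{find}, pk_0, pk_1)$
    $y \leftarrow \mathcal{E}_{pk_b}(x)$
    $d \leftarrow A_{\text{cca}}^{\mathcal{D}_{sk_0}(\cdot), \mathcal{D}_{sk_1}(\cdot)}(\text{guess}, y, s)$
    Return $d$

Above it is mandated that $A_{\text{cca}}$ never queries $\mathcal{D}_{sk_0}(\cdot)$ or $\mathcal{D}_{sk_1}(\cdot)$ on the challenge
ciphertext $y$. For $\text{atk} \in \{\text{cpa}, \text{cca}\}$ we define the advantages of the adversaries
via

    $\mathbf{Adv}^{\text{ik-atk}}_{\mathcal{PE},A_{\text{atk}}}(k) = \Pr[\mathbf{Exp}^{\text{ik-atk-}1}_{\mathcal{PE},A_{\text{atk}}}(k) = 1] - \Pr[\mathbf{Exp}^{\text{ik-atk-}0}_{\mathcal{PE},A_{\text{atk}}}(k) = 1]$ .

The scheme $\mathcal{PE}$ is said to be IK-CPA secure (respectively IK-CCA secure) if
the function $\mathbf{Adv}^{\text{ik-cpa}}_{\mathcal{PE},A}(\cdot)$ (resp. $\mathbf{Adv}^{\text{ik-cca}}_{\mathcal{PE},A}(\cdot)$) is negligible for any adversary $A$
whose time complexity is polynomial in $k$.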

The “time-complexity” is the worst case execution time of the experiment plus 
the size of the code of the adversary, in some fixed RAM model of computation. 
(Note that the execution time refers to the entire experiment, not just the advers- 
ary. In particular, it includes the time for key generation, challenge generation, 
and computation of responses to oracle queries if any.) The same convention is 
used for all other definitions in this paper and will not be explicitly mentioned 
again. 
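
To make the IK-CPA experiment concrete, here is a minimal Python sketch of it,
run against a deliberately leaky placeholder scheme whose ciphertext carries the
public key in the clear. Everything named here (`toy_G`, `toy_K`, `toy_E`,
`leak_adversary`, the calling convention for the two adversary stages) is a
hypothetical illustration and not a construction from this paper; the
placeholder scheme offers no security of any kind and exists only to exercise
the experiment.

```python
import secrets

def exp_ik_cpa(scheme, adversary, b, k):
    """Run the IK-CPA experiment with bit b and return the adversary's guess d."""
    G, K, E = scheme
    I = G(k)                                  # common key
    pk0, _ = K(I)
    pk1, _ = K(I)
    x, s = adversary("find", pk0, pk1)        # message x and state s
    y = E((pk0, pk1)[b], x)                   # challenge encrypted under pk_b
    return adversary("guess", y, s)

def estimate_ik_cpa_advantage(scheme, adversary, k, trials=100):
    """Estimate Pr[Exp with b=1 returns 1] - Pr[Exp with b=0 returns 1]."""
    ones = [sum(exp_ik_cpa(scheme, adversary, b, k) for _ in range(trials))
            for b in (0, 1)]
    return (ones[1] - ones[0]) / trials

# Placeholder scheme (assumption, insecure): the ciphertext is (pk, x).
def toy_G(k):
    return k

def toy_K(I):
    return secrets.token_hex(I // 8), None    # stand-in public key, no secret key

def toy_E(pk, x):
    return (pk, x)

def leak_adversary(stage, first, second):
    if stage == "find":
        return b"any message", (first, second)
    y, (_pk0, pk1) = first, second
    return 1 if y[0] == pk1 else 0            # read the key off the ciphertext

LEAKY = (toy_G, toy_K, toy_E)
```

Calling `estimate_ik_cpa_advantage(LEAKY, leak_adversary, 128)` yields advantage
1 (barring a collision between the two random keys), which is the extreme case
of a scheme that exposes the key regardless of how well it might hide the data.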



2.3 Anonymous One-Way Functions 



A family of functions $F = (K, S, E)$ is specified by three algorithms. The
randomized key-generation algorithm $K$ takes input the security parameter $k \in \mathbb{N}$
and returns a pair $(pk, sk)$ where $pk$ is a public key, and $sk$ is an associated
secret key. (In cases where the family is not trapdoor, the secret key is simply
the empty string.) The randomized sampling algorithm $S$ takes input $pk$
and returns a random point in a set that we call the domain of $pk$ and denote
$\mathrm{Dom}_F(pk)$. We usually omit explicit mention of the sampling algorithm and just
write $x \stackrel{R}{\leftarrow} \mathrm{Dom}_F(pk)$. The deterministic evaluation algorithm $E$ takes input $pk$
and a point $x \in \mathrm{Dom}_F(pk)$ and returns an output we denote by $E_{pk}(x)$. We
let $\mathrm{Rng}_F(pk) = \{ E_{pk}(x) : x \in \mathrm{Dom}_F(pk) \}$ denote the range of the function
$E_{pk}(\cdot)$. We say that $F$ is a family of trapdoor functions if there exists a
deterministic inversion algorithm $I$ that takes input $sk$ and a point $y \in \mathrm{Rng}_F(pk)$ and
returns a point $x \in \mathrm{Dom}_F(pk)$ such that $E_{pk}(x) = y$. We say that $F$ is a family
of permutations if $\mathrm{Dom}_F(pk) = \mathrm{Rng}_F(pk)$ and $E_{pk}$ is a permutation on this set.



Definition 2. Let $F = (K, S, E)$ be a family of functions. Let $b \in \{0,1\}$ and
$k \in \mathbb{N}$ be a security parameter. Let $0 < \theta \le 1$ be a constant. Let $A$, $B$ be
adversaries. Now, we consider the following experiments:

Experiment $\mathbf{Exp}^{\theta\text{-pow-fnc}}_{F,B}(k)$
    $(pk, sk) \stackrel{R}{\leftarrow} K(k)$
    $x_1 \| x_2 \stackrel{R}{\leftarrow} \mathrm{Dom}_F(pk)$ where $|x_1| = \lceil \theta \cdot |(x_1 \| x_2)| \rceil$
    $y \leftarrow E_{pk}(x_1 \| x_2)$
    $x_1' \leftarrow B(pk, y)$ where $|x_1'| = |x_1|$
    If $E_{pk}(x_1' \| x_2') = y$ for some $x_2'$ then return 1
    Else return 0

Experiment $\mathbf{Exp}^{\text{ik-fnc-}b}_{F,A}(k)$
    $(pk_0, sk_0) \stackrel{R}{\leftarrow} K(k)$
    $(pk_1, sk_1) \stackrel{R}{\leftarrow} K(k)$
    $x \stackrel{R}{\leftarrow} \mathrm{Dom}_F(pk_b)$
    $y \leftarrow E_{pk_b}(x)$
    $d \leftarrow A(pk_0, pk_1, y)$
    Return $d$

We define the advantages of the adversaries via

    $\mathbf{Adv}^{\theta\text{-pow-fnc}}_{F,B}(k) = \Pr[\mathbf{Exp}^{\theta\text{-pow-fnc}}_{F,B}(k) = 1]$

    $\mathbf{Adv}^{\text{ik-fnc}}_{F,A}(k) = \Pr[\mathbf{Exp}^{\text{ik-fnc-}1}_{F,A}(k) = 1] - \Pr[\mathbf{Exp}^{\text{ik-fnc-}0}_{F,A}(k) = 1]$ .

The family $F$ is said to be $\theta$-partial one-way if the function $\mathbf{Adv}^{\theta\text{-pow-fnc}}_{F,B}(\cdot)$ is
negligible for any adversary $B$ whose time complexity is polynomial in $k$. The
family $F$ is said to be anonymous if the function $\mathbf{Adv}^{\text{ik-fnc}}_{F,A}(\cdot)$ is negligible for
any adversary $A$ whose time complexity is polynomial in $k$. The family $F$ is said
to be perfectly anonymous if $\mathbf{Adv}^{\text{ik-fnc}}_{F,A}(k) = 0$ for every $k$ and every adversary $A$.

Note that when $\theta = 1$ the notion of $\theta$-partial one-wayness coincides with the
standard notion of one-wayness. As the above indicates, we expect that
information-theoretic anonymity is possible for one-way functions, even though
not for encryption schemes.

3 Anonymity of DDH-Based Schemes 

The DDH-based schemes we consider work over a group of prime order. This
could be a subgroup of order $q$ of $\mathbb{Z}_p^*$ where $p, q$ are primes such that $q$ divides
$p - 1$. It could also be an elliptic curve group of prime order. For concreteness
our description is for the first case. Specifically if $q$ is a prime such that $2q+1$
is also prime we let $G_q$ be the subgroup of quadratic residues of $\mathbb{Z}^*_{2q+1}$. It has order
$q$. A prime-order-group generator is a probabilistic algorithm that on input the
security parameter $k$ returns a pair $(q, g)$ satisfying the following conditions:
$q$ is a prime with $2^{k-1} < q < 2^k$; $2q+1$ is a prime; and $g$ is a generator of
$G_q$. (There are numerous possible specific prime-order-group generators.) We
will relate the anonymity of the El Gamal and Cramer-Shoup schemes to the
hardness of the DDH problem for appropriate prime-order-group generators.
Accordingly we next summarize definitions for the latter.

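Definition 3. [DDH] Let $\mathcal{G}$ be a prime-order-group generator. Let $D$ be an
adversary that on input $q$, $g$ and three elements $X, Y, T \in G_q$ returns a bit. We
consider the following experiments:

Experiment $\mathbf{Exp}^{\text{ddh-real}}_{\mathcal{G},D}(k)$
    $(q, g) \stackrel{R}{\leftarrow} \mathcal{G}(k)$
    $x \stackrel{R}{\leftarrow} \mathbb{Z}_q$ ; $X \leftarrow g^x$
    $y \stackrel{R}{\leftarrow} \mathbb{Z}_q$ ; $Y \leftarrow g^y$
    $T \leftarrow g^{xy}$
    $d \leftarrow D(q, g, X, Y, T)$
    Return $d$

Experiment $\mathbf{Exp}^{\text{ddh-rand}}_{\mathcal{G},D}(k)$
    $(q, g) \stackrel{R}{\leftarrow} \mathcal{G}(k)$
    $x \stackrel{R}{\leftarrow} \mathbb{Z}_q$ ; $X \leftarrow g^x$
    $y \stackrel{R}{\leftarrow} \mathbb{Z}_q$ ; $Y \leftarrow g^y$
    $T \stackrel{R}{\leftarrow} G_q$
    $d \leftarrow D(q, g, X, Y, T)$
    Return $d$

The advantage of $D$ in solving the Decisional Diffie-Hellman (DDH) problem for
$\mathcal{G}$ is the function of the security parameter defined by

    $\mathbf{Adv}^{\text{ddh}}_{\mathcal{G},D}(k) = \Pr[\mathbf{Exp}^{\text{ddh-real}}_{\mathcal{G},D}(k) = 1] - \Pr[\mathbf{Exp}^{\text{ddh-rand}}_{\mathcal{G},D}(k) = 1]$ .

We say that the DDH problem is hard for $\mathcal{G}$ if the function $\mathbf{Adv}^{\text{ddh}}_{\mathcal{G},D}(\cdot)$ is negligible
for every algorithm $D$ whose time-complexity is polynomial in $k$.

3.1 El Gamal 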



The El Gamal scheme in a group of prime order is known to meet the notion 
of indistinguishability under chosen-plaintext attack under the assumption that 
the decision Diffie-Hellman (DDH) problem is hard. (This is noted in [25,12] 
and fully treated in [33]). We want to look at the anonymity of the El Gamal 
encryption scheme under chosen-plaintext attack. 

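Let $\mathcal{G}$ be a prime-order-group generator. This is the common key generation
algorithm of the associated scheme $\mathcal{EG} = (\mathcal{G}, \mathcal{K}, \mathcal{E}, \mathcal{D})$, the rest of whose
algorithms are as follows:

Algorithm $\mathcal{K}(q, g)$
    $x \stackrel{R}{\leftarrow} \mathbb{Z}_q$
    $X \leftarrow g^x$
    $pk \leftarrow (q, g, X)$
    $sk \leftarrow (q, g, x)$
    Return $(pk, sk)$

Algorithm $\mathcal{E}_{pk}(M)$
    $y \stackrel{R}{\leftarrow} \mathbb{Z}_q$
    $Y \leftarrow g^y$
    $T \leftarrow X^y$
    $W \leftarrow TM$
    Return $(Y, W)$

Algorithm $\mathcal{D}_{sk}(Y, W)$
    $T \leftarrow Y^x$
    $M \leftarrow W T^{-1}$
    Return $M$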



The message space associated to a public key $(q, g, X)$ is the group $G_q$ itself, with
the understanding that all messages from $G_q$ are properly encoded as strings of
some common length whenever appropriate. Note that a generator $g$ is the output
of the common key generation algorithm, which means we fix $g$ for all keys. We
do this only for simplicity and will show that all our results also hold in
the case where each key uses a random generator $g$.
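
The El Gamal algorithms above can be sketched in a few lines of Python. This is
a toy under stated assumptions: the group parameters `Q = 1019`, `P = 2039` and
the generator `G = 4` are hypothetical small values chosen so the example runs
instantly, and the function names are ours. A real deployment would use a
prime-order-group generator with a large security parameter.

```python
import secrets

Q = 1019          # toy prime q (assumption: far too small for real use)
P = 2 * Q + 1     # 2039 is also prime, so the quadratic residues mod P have order Q
G = 4             # 2^2 is a quadratic residue other than 1, hence generates G_q

def in_group(a):
    """Membership test for G_q, the order-Q subgroup of Z_P^*."""
    return 0 < a < P and pow(a, Q, P) == 1

def keygen():
    x = secrets.randbelow(Q)
    return (Q, G, pow(G, x, P)), (Q, G, x)      # pk = (q, g, X), sk = (q, g, x)

def encrypt(pk, m):
    _, g, X = pk
    y = secrets.randbelow(Q)
    Y = pow(g, y, P)
    T = pow(X, y, P)
    return Y, (T * m) % P                       # ciphertext (Y, W)

def decrypt(sk, ct):
    _, _, x = sk
    Y, W = ct
    T = pow(Y, x, P)
    return (W * pow(T, -1, P)) % P              # M = W * T^{-1}
```

Because every user shares `Q` and `G`, both ciphertext components lie in the same
group $G_q$ no matter which public key was used, so the range-based attack that
works against RSA has no analogue here. Messages must be encoded as elements of
$G_q$, for example `pow(G, 77, P)`.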

We now analyze the anonymity of the El Gamal scheme under chosen-plaintext 
attack. 



Theorem 1. Let $\mathcal{G}$ be a prime-order-group generator. If the DDH problem is
hard for $\mathcal{G}$ then the associated El Gamal scheme $\mathcal{EG}$ is IK-CPA secure.
Concretely, for any adversary $A$ there exists a distinguisher $D$ such that for any
$k$

    $\mathbf{Adv}^{\text{ik-cpa}}_{\mathcal{EG},A}(k) \le 2\,\mathbf{Adv}^{\text{ddh}}_{\mathcal{G},D}(k) + \frac{1}{2^{k-2}}$

and the running time of $D$ is that of $A$ plus $O(k^3)$.

The proof of the above is in the full version of this paper [2].



3.2 Cramer-Shoup 

The El Gamal scheme provides data privacy and anonymity against
chosen-plaintext attack. We now consider the Cramer-Shoup scheme [12] in order to
obtain the same security properties under chosen-ciphertext attack. We will use
collision-resistant hash functions so we begin by recalling what we need.

A family of hash functions $\mathcal{H} = (\mathcal{GH}, \mathcal{EH})$ is defined by a probabilistic
generator algorithm $\mathcal{GH}$ (which takes as input the security parameter $k$ and returns
a key $K$) and a deterministic evaluation algorithm $\mathcal{EH}$ (which takes as input
the key $K$ and a string $M \in \{0,1\}^*$ and returns a string $\mathcal{EH}_K(M) \in \{0,1\}^{k-1}$).

Definition 4. Let $\mathcal{H} = (\mathcal{GH}, \mathcal{EH})$ be a family of hash functions and let $C$ be
an adversary that on input a key $K$ returns two strings. Now, we consider the
following experiment:

Experiment $\mathbf{Exp}^{\text{cr}}_{\mathcal{H},C}(k)$
    $K \stackrel{R}{\leftarrow} \mathcal{GH}(k)$ ; $(x_0, x_1) \leftarrow C(K)$
    If $(x_0 \ne x_1)$ and $\mathcal{EH}_K(x_0) = \mathcal{EH}_K(x_1)$ then return 1 else return 0

We define the advantage of adversary $C$ via

    $\mathbf{Adv}^{\text{cr}}_{\mathcal{H},C}(k) = \Pr[\mathbf{Exp}^{\text{cr}}_{\mathcal{H},C}(k) = 1]$ .

We say that the family of hash functions $\mathcal{H}$ is collision-resistant if $\mathbf{Adv}^{\text{cr}}_{\mathcal{H},C}(\cdot)$
is negligible for every algorithm $C$ whose time-complexity is polynomial in $k$.

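Let $\mathcal{G}$ be a prime-order-group generator. The common key generation algorithm
of the associated Cramer-Shoup scheme $\mathcal{CS} = (\overline{\mathcal{G}}, \mathcal{K}, \mathcal{E}, \mathcal{D})$ is:

Algorithm $\overline{\mathcal{G}}(k)$: $(q, g) \stackrel{R}{\leftarrow} \mathcal{G}(k)$ ; $g_1 \leftarrow g$ ; $g_2 \stackrel{R}{\leftarrow} G_q$ ; $K \stackrel{R}{\leftarrow} \mathcal{GH}(k)$ ; Return $(q, g_1, g_2, K)$.

The rest of the algorithms are specified as follows:

Algorithm $\mathcal{K}(q, g_1, g_2, K)$
    $x_1, x_2, y_1, y_2, z \stackrel{R}{\leftarrow} \mathbb{Z}_q$
    $c \leftarrow g_1^{x_1} g_2^{x_2}$ ; $d \leftarrow g_1^{y_1} g_2^{y_2}$
    $h \leftarrow g_1^{z}$
    $pk \leftarrow (g_1, g_2, c, d, h, K)$
    $sk \leftarrow (x_1, x_2, y_1, y_2, z)$
    Return $(pk, sk)$

Algorithm $\mathcal{E}_{pk}(M)$
    $r \stackrel{R}{\leftarrow} \mathbb{Z}_q$
    $u_1 \leftarrow g_1^{r}$ ; $u_2 \leftarrow g_2^{r}$
    $e \leftarrow h^{r} M$
    $\alpha \leftarrow \mathcal{EH}_K(u_1, u_2, e)$
    $v \leftarrow c^{r} d^{r\alpha}$
    Return $(u_1, u_2, e, v)$

Algorithm $\mathcal{D}_{sk}(u_1, u_2, e, v)$
    $\alpha \leftarrow \mathcal{EH}_K(u_1, u_2, e)$
    If $u_1^{x_1 + y_1\alpha} u_2^{x_2 + y_2\alpha} = v$
        then $M \leftarrow e / u_1^{z}$
        else $M \leftarrow \bot$
    Return $M$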



The message space is the group $G_q$. Note that the range of the hash function
$\mathcal{EH}_K$ is $\{0,1\}^{k-1}$ which we identify with $\{0, \ldots, 2^{k-1} - 1\}$. Since $q > 2^{k-1}$ this is
a subset of $\mathbb{Z}_q$. Again for simplicity we assume that $g_1, g_2$ are fixed for all keys
but we will show that our results hold even if $g_1, g_2$ are chosen at random for all
keys.
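
For readers who want to trace the Cramer-Shoup algorithms step by step, the
following Python sketch mirrors them over the same kind of toy group. The
parameters `Q`, `P`, `G1`, `G2` are hypothetical small values, and the keyless
SHA-256 based function `hash_to_zq` is our stand-in for $\mathcal{EH}_K$; neither is
taken from the paper, and nothing here is secure at this size.

```python
import hashlib
import secrets

Q = 1019                 # toy prime q (assumption)
P = 2 * Q + 1            # 2039, also prime
G1, G2 = 4, 9            # two quadratic residues mod P, each generating G_q

def hash_to_zq(u1, u2, e):
    """Stand-in for the keyed hash EH_K: SHA-256 reduced into Z_q (assumption)."""
    data = f"{u1},{u2},{e}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q

def keygen():
    x1, x2, y1, y2, z = (secrets.randbelow(Q) for _ in range(5))
    c = pow(G1, x1, P) * pow(G2, x2, P) % P
    d = pow(G1, y1, P) * pow(G2, y2, P) % P
    h = pow(G1, z, P)
    return (c, d, h), (x1, x2, y1, y2, z)

def encrypt(pk, m):
    c, d, h = pk
    r = secrets.randbelow(Q)
    u1, u2 = pow(G1, r, P), pow(G2, r, P)
    e = pow(h, r, P) * m % P
    a = hash_to_zq(u1, u2, e)
    v = pow(c, r, P) * pow(d, r * a, P) % P
    return u1, u2, e, v

def decrypt(sk, ct):
    x1, x2, y1, y2, z = sk
    u1, u2, e, v = ct
    a = hash_to_zq(u1, u2, e)
    if pow(u1, x1 + y1 * a, P) * pow(u2, x2 + y2 * a, P) % P != v:
        return None                              # plays the role of the symbol for "invalid"
    return e * pow(pow(u1, z, P), -1, P) % P     # M = e / u1^z
```

The validity check in `decrypt` is what separates this scheme from El Gamal under
chosen-ciphertext attack: a ciphertext whose last component has been altered is
rejected rather than decrypted.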

We now analyze the anonymity of CS under chosen-ciphertext attack. 

Theorem 2. Let $\mathcal{G}$ be a prime-order-group generator and let $\mathcal{CS}$ be the
associated Cramer-Shoup scheme. If the DDH problem is hard for $\mathcal{G}$ then $\mathcal{CS}$ is
anonymous in the sense of IK-CCA. Concretely, for any adversary $A$ attacking
the anonymity of $\mathcal{CS}$ under a chosen-ciphertext attack and making in total $q_{\text{dec}}(\cdot)$
decryption oracle queries, there exists a distinguisher $D$ for DDH and an adversary
$C$ attacking the collision-resistance of $\mathcal{H}$ such that

    $\mathbf{Adv}^{\text{ik-cca}}_{\mathcal{CS},A}(k) \le 2\,\mathbf{Adv}^{\text{ddh}}_{\mathcal{G},D}(k) + 2\,\mathbf{Adv}^{\text{cr}}_{\mathcal{H},C}(k) + \frac{q_{\text{dec}}(k) + 2}{2^{k-3}}$

and the running time of $D$ and $C$ is that of $A$ plus $O(k^3)$.

The proof of the above is in the full version of this paper [2]. Note that security
of the Cramer-Shoup scheme in the IE-CCA sense has been proven in [12] using
a weaker assumption on the hash function $\mathcal{H}$ than the one we have here. They
do not require that $\mathcal{H}$ be collision-resistant, as we do, but only that it be a
universal one-way family of hash functions (UOWHF) [26]. We have at this time
not determined if the scheme can also be proven secure in the IK-CCA sense
assuming $\mathcal{H}$ to be a UOWHF.

4 Anonymity of RSA-Based Schemes 

The attack on RSA mentioned in Section 1 implies that the RSA family of trap- 
door permutations is not anonymous. This means that all traditional RSA-based 
encryption schemes are not anonymous. We provide several ways to implement 
anonymous RSA-based encryption. First we take a direct approach, specifying 
an anonymous RSA-OAEP variant based on repetition and proving it secure in 
the random oracle model. Then we show how to construct anonymous trapdoor 
permutation families based on RSA and derive anonymous RSA-based encryp- 
tion schemes from them. In particular, the latter leads to anonymous encryption 
schemes whose proofs of security are in the standard rather than the random 
oracle model. We begin with a description of the RSA family of trapdoor per- 
mutations we will use in this section. See Section 2 for notions of security for 
families of trapdoor permutations. 

Example 1. The specifications of the standard RSA family of trapdoor
permutations RSA = $(K, S, E)$ are as follows. The key generation algorithm takes as
input a security parameter $k$ and picks random, distinct primes $p, q$ in the range
$2^{k/2-1} < p, q < 2^{k/2}$. (If $k$ is odd, increment it by 1 before picking the primes.)
It sets $N = pq$. It picks $e, d \in \mathbb{Z}^*_{\varphi(N)}$ such that $ed \equiv 1 \pmod{\varphi(N)}$ where
$\varphi(N) = (p-1)(q-1)$. The public key is $N, e$ and the secret key is $N, d$. The sets
$\mathrm{Dom}_{\mathrm{RSA}}(N, e)$ and $\mathrm{Rng}_{\mathrm{RSA}}(N, e)$ are both equal to $\mathbb{Z}_N^*$. The evaluation algorithm
is $E_{N,e}(x) = x^e \bmod N$ and the inversion algorithm is $I_{N,d}(y) = y^d \bmod N$. The
sampling algorithm returns a random point in $\mathbb{Z}_N^*$.

The anonymity attack on RSA carries over to most encryption schemes based 
on it, including the most popular one, RSA-OAEP. We next describe a variant 
of RSA-OAEP that preserves its data-privacy properties but is in addition ano- 
nymous. 



4.1 Anonymous Variant of RSA-OAEP 

The original scheme and our variant are described in the random-oracle (RO)
model [7]. All the notions of security, defined earlier, can be “lifted” to the
RO setting in a straightforward manner. To modify the definitions, begin the
experiment defining advantage by choosing random functions $G$ and $H$, each
from the set of all functions from some appropriate domain to appropriate range.
Then provide a $G$-oracle and $H$-oracle to the adversaries, and allow that $\mathcal{E}_{pk}$ and
$\mathcal{D}_{sk}$ may depend on $G$ and $H$ (which we write as $\mathcal{E}^{G,H}_{pk}$ and $\mathcal{D}^{G,H}_{sk}$).

The idea behind our variant is to repeat the standard encryption procedure
under RSA-OAEP, until the ciphertext falls in some “safe” range. We refer to
our scheme as RSA-RAEP (for repeated asymmetric encryption with padding).
More concretely, for RSA = $(K, S, E)$, our scheme RSA-RAEP = $(\mathcal{G}, \mathcal{K}, \mathcal{E}, \mathcal{D})$ is
as follows. The common key generator algorithm $\mathcal{G}$ takes a security parameter $k$
and returns parameters $k$, $k_0$ and $k_1$ such that $k_0(k) + k_1(k) < k$ for all $k > 1$.
This defines an associated plaintext-length function $n(k) = k - k_0(k) - k_1(k)$. The
key generation algorithm $\mathcal{K}$ takes $k, k_0, k_1$ and runs the key-generation algorithm
of the RSA family, namely $K$ on $k$, to get a public key $(N, e)$ and secret key $(N, d)$
(see Example 1). The public key for the scheme $pk$ is $(N, e), k, k_0, k_1$ and the
secret key $sk$ is $(N, d), k, k_0, k_1$. The other algorithms are depicted below. The
oracles $G$ and $H$ which $\mathcal{E}_{pk}$ and $\mathcal{D}_{sk}$ reference below map bit strings as follows:
$G : \{0,1\}^{k_0} \to \{0,1\}^{n+k_1}$ and $H : \{0,1\}^{n+k_1} \to \{0,1\}^{k_0}$.



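Algorithm $\mathcal{E}^{G,H}_{pk}(x)$
    $ctr = -1$
    Repeat
        $ctr \leftarrow ctr + 1$
        $r \stackrel{R}{\leftarrow} \{0,1\}^{k_0}$
        $s \leftarrow (x \| 0^{k_1}) \oplus G(r)$
        $t \leftarrow r \oplus H(s)$
        $v \leftarrow (s \| t)^e \bmod N$
    Until $(v < 2^{k-2}) \lor (ctr = k_1)$
    If $ctr = k_1$ then $y \leftarrow 1 \| 0^{k_0+k_1} \| x$
    Else $y \leftarrow 0 \| v$
    Return $y$

Algorithm $\mathcal{D}^{G,H}_{sk}(y)$
    Parse $y$ as $b \| v$ where $b$ is a bit
    If $b = 1$ then parse $v$ as $w \| x$ where $|x| = n$
        If $w = 0^{k_0+k_1}$ then $z \leftarrow x$
        Else (if $w \ne 0^{k_0+k_1}$) $z \leftarrow \bot$
    Else (if $b = 0$)
        $(s \| t) \leftarrow v^d \bmod N$ where $|s| = k_1 + n$ and $|t| = k_0$
        $r \leftarrow t \oplus H(s)$
        $(x \| p) \leftarrow s \oplus G(r)$ where $|x| = n$ and $|p| = k_1$
        If $p = 0^{k_1}$ then $z \leftarrow x$
        Else $z \leftarrow \bot$
    Return $z$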



Note that the valid ciphertexts under RSA-OAEP are (uniformly) distributed in
$\mathrm{Rng}_{\mathrm{RSA}}(N, e)$, which is $\mathbb{Z}_N^*$. Under RSA-RAEP, valid ciphertexts take the form
$0 \| v$ where $v \in (\mathbb{Z}_N^* \cap [1, 2^{k-2}])$. The expected running time of this scheme is
approximately twice that of RSA-OAEP (and $k_1$ times more, in the worst case).
The ciphertext is longer by one bit. However, unlike RSA-OAEP, this scheme
turns out to be IK-CCA secure. The (data-privacy) security of RSA-OAEP under
CCA has already been established [18]. It is not hard to see that this result holds
for RSA-RAEP as well. We omit the (simple) proof of this, noting only that the
security (relative to RSA-OAEP) degrades roughly by the probability that after
$k_1$ repetitions, the ciphertext was still not in the desired range (and consequently,
the plaintext had to be sent in the clear). Given this, we turn to determining
its security in the IK-CCA sense. We show that if the RSA family of trapdoor
permutations is partial one-way then RSA-RAEP is anonymous.
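
The repetition idea can be sketched in Python as follows. This is only an
illustration under several assumptions that are ours, not the paper's: the
32-bit toy modulus built from the primes 65521 and 65519, the exponent 11,
SHAKE-256 as a stand-in for the random oracles $G$ and $H$, a padded block of
$k-2$ bits (so that it is always smaller than $N$), and a ciphertext represented
as a pair (flag bit, integer) instead of a bit string, with the zero padding of
the fallback case left implicit.

```python
import hashlib
import secrets

PRIME_P, PRIME_Q = 65521, 65519          # toy primes in (2^15, 2^16) (assumption)
MOD_N = PRIME_P * PRIME_Q                # 32-bit modulus
EXP_E = 11
EXP_D = pow(EXP_E, -1, (PRIME_P - 1) * (PRIME_Q - 1))
K_BITS, K0, K1 = 32, 10, 10
N_BITS = K_BITS - 2 - K0 - K1            # plaintext length in this sketch (10 bits)
SAFE_BOUND = 2 ** (K_BITS - 2)           # ciphertexts below this fit under every modulus

def _oracle(tag, value, out_bits):
    """Hash-based stand-in for a random oracle with out_bits output bits."""
    data = tag + value.to_bytes(8, "big")
    digest = hashlib.shake_256(data).digest((out_bits + 7) // 8)
    return int.from_bytes(digest, "big") >> (-out_bits % 8)

def G(r):
    return _oracle(b"G", r, N_BITS + K1)

def H(s):
    return _oracle(b"H", s, K0)

def encrypt(x):
    assert 0 <= x < 2 ** N_BITS
    ctr = -1
    while True:
        ctr += 1
        r = secrets.randbits(K0)
        s = (x << K1) ^ G(r)                      # (x || 0^{k1}) xor G(r)
        t = r ^ H(s)
        v = pow((s << K0) | t, EXP_E, MOD_N)      # (s || t)^e mod N
        if v < SAFE_BOUND or ctr == K1:
            break
    return (1, x) if ctr == K1 else (0, v)        # fallback sends x in the clear

def decrypt(y):
    b, v = y
    if b == 1:
        return v
    st = pow(v, EXP_D, MOD_N)
    if st >= SAFE_BOUND:                          # not a well-formed padded block
        return None
    s, t = st >> K0, st & (2 ** K0 - 1)
    r = t ^ H(s)
    xp = s ^ G(r)
    x, p = xp >> K1, xp & (2 ** K1 - 1)
    return x if p == 0 else None
```

Every ciphertext with flag bit 0 is below `SAFE_BOUND`, so its size no longer
reveals anything about which modulus was used. With this particular toy modulus,
which is close to $2^k$, a single attempt succeeds only about a quarter of the
time, so the fallback branch is reached far more often than it would be with
realistic parameters.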

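Theorem 3. If the RSA family of trapdoor permutations is partial one-way then
$\Pi$ = RSA-RAEP is anonymous. Concretely, for any adversary $A$ attacking the
anonymity of $\Pi$ under a chosen-ciphertext attack, and making at most $q_{\text{dec}}$
decryption oracle queries, $q_{\text{gen}}$ $G$-oracle queries and $q_{\text{hash}}$ $H$-oracle queries, there
exists a $\theta$-partial inverting adversary $M_A$ for the RSA family, such that for any
$k$, $k_0(k)$, $k_1(k)$ and $\theta = \frac{k - k_0(k)}{k}$,

    $\mathbf{Adv}^{\text{ik-cca}}_{\Pi,A}(k) \le 32\,q_{\text{hash}} \cdot ((1 - \varepsilon_1) \cdot (1 - \varepsilon_2) \cdot (1 - \varepsilon_3))^{-1} \cdot \mathbf{Adv}^{\theta\text{-pow-fnc}}_{\mathrm{RSA},M_A}(k) + q_{\text{gen}} \cdot (1 - \varepsilon_3)^{-1} \cdot \ldots$

where

    $\ldots = \frac{1}{2^{k/2-3} - 1}$ ,    $\ldots\; 2q_{\text{gen}} + q_{\text{dec}} + 2q_{\text{gen}}q_{\text{dec}} \;\ldots\; 2q_{\text{dec}} \;\ldots\; 2q_{\text{hash}} \;\ldots$

and the running time of $M_A$ is that of $A$ plus $q_{\text{gen}} \cdot q_{\text{hash}} \cdot O(k^3)$.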

The proof of the above is in the full version of this paper [2]. Note that for
typical parameters $k_0(k)$, $k_1(k)$, and number of allowed queries $q_{\text{gen}}$, $q_{\text{hash}}$ and
$q_{\text{dec}}$, the values of $\varepsilon_1$, $\varepsilon_2$ and $\varepsilon_3$ are very small. This means that if there exists
an adversary that is successful in breaking RSA-RAEP in the IK-CCA sense,
then there exists a partial inverting adversary for the RSA family of trapdoor
permutations that has a comparable advantage and running time.

The $\theta$-partial one-wayness of RSA has been shown to be equivalent to the
one-wayness of RSA, for $\theta > 0.5$ [18]. In RSA-RAEP (as also in RSA-OAEP) this
is usually the case. (In general, the equivalence holds if any constant fraction of
the most significant bits of the pre-image can be recovered, but the reduction
is proportionately weaker [18].) Using this and Theorem 3 we are able to prove
the security of RSA-RAEP in the IK-CCA sense assuming RSA to be one-way.
A theorem to this effect, with concrete bounds, can be found in the full version
of this paper [2].

4.2 Encryption with Anonymous Trapdoor Permutations 

Given that the standard RSA family is not anonymous, we seek families that 
are. We describe some simple RSA-derived anonymous families. 

Construction 1. We define a family $F = (K, S, E)$ as follows. The key
generation algorithm is the same as in the standard RSA family of Example 1.
Let $(N, e)$ be a public key and $k$ the corresponding security parameter. We set
$\mathrm{Dom}_F(N, e) = \mathrm{Rng}_F(N, e) = \{0,1\}^k$. Viewing $\mathbb{Z}_N^*$ as a subset of $\{0,1\}^k$ we
define

    $E_{N,e}(x) = x^e \bmod N$ if $x \in \mathbb{Z}_N^*$, and $E_{N,e}(x) = x$ otherwise

for any $x \in \{0,1\}^k$. This is a permutation on $\{0,1\}^k$. The sampling algorithm
$S$ on input $N, e$ simply returns a random $k$-bit string. It is easy to see that this
family is trapdoor.
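
A small Python sketch may help in seeing why Construction 1 is a permutation
on all $k$-bit strings and yet has many points on which it is the identity. The
toy key `N = 143`, `E = 7`, `K = 8` and the function names are hypothetical
choices for illustration; `cross_product` is a direct componentwise application
of the same map to a tuple of $k$-bit values.

```python
from math import gcd

def extended_rsa(n, e, k, x):
    """E_{N,e} on k-bit strings (as integers): RSA on Z_N^*, identity elsewhere."""
    if not 0 <= x < 2 ** k:
        raise ValueError("x must be a k-bit value")
    if 0 < x < n and gcd(x, n) == 1:      # x lies in Z_N^*
        return pow(x, e, n)
    return x

def cross_product(n, e, k, xs):
    """Apply the extended permutation to each k-bit component of xs."""
    return tuple(extended_rsa(n, e, k, x) for x in xs)

# Toy key (assumption): primes 11 and 13 lie in (2^3, 2^4), so k = 8.
N, E, K = 11 * 13, 7, 8
```

With this toy key, at least 136 of the 256 inputs fall outside $\mathbb{Z}_N^*$ and are
mapped to themselves, which is why an inverter succeeds on a constant fraction
of points and the family can only be weakly one-way.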

As we will see, the family $F$ is perfectly anonymous. But it is not one-way.
However, it is weakly one-way. (Meaning, for every polynomial-time adversary
$B$, there is a polynomial $\beta(\cdot)$ such that $\mathbf{Adv}^{\theta\text{-pow-fnc}}_{F,B}(k) \le 1 - 1/\beta(k)$ with $\theta = 1$, for all
sufficiently large $k$.) Thus, standard transformations of weak to strong one-way
functions (cf. [19, Section 2.3]) can be applied. Most of these preserve anonymity.
To be concrete, let us use one.

Construction 2. Let $\overline{F} = (\overline{K}, \overline{S}, \overline{E})$ be obtained from $F$ of Construction 1
by Yao's cross-product construction [34]. In detail, the key-generation
algorithm is unchanged and for any key $N, e$ we set $\mathrm{Dom}_{\overline{F}}(N, e) = \mathrm{Rng}_{\overline{F}}(N, e) =
\{0,1\}^{k^2}$. Parsing a point from this domain as a sequence of $k$-bit strings we set
$\overline{E}_{N,e}(x_1, \ldots, x_k) = (E_{N,e}(x_1), \ldots, E_{N,e}(x_k))$. The sampling algorithm is obvious
and it is easy to see the family is trapdoor.

Proposition 1. The family $\overline{F}$ of Construction 2 is a perfectly anonymous
family of trapdoor, one-way permutations, under the assumption that the standard
RSA family is one-way.

The proof of one-wayness is a direct consequence of the known results on the 
security of the cross-product construction. (A proof of Yao’s result can be found 
in [19, Section 2.3].) The anonymity is easy to see. Regardless of the key, the
adversary simply gets a random string of length $k^2$, and can have no advantage
in determining the key based on it.

The drawback of the construction is that the cross product construction is 
costly, increasing both the computational and the space requirements. There 
are alternative amplification methods that are better and in particular do not 
increase space requirements, but we know of none that do not increase the com- 
putational cost. 

Standard methods of trapdoor permutation based encryption yield anony- 
mous schemes provided the underlying trapdoor permutation is anonymous. This 
means any encryption method based on hardcore bits [21]. 

These methods lead to appreciable losses of concrete security, which is why
we do not state concrete security versions of the results.

Abstract. A fair blind signature scheme allows the trustee to revoke
blindness so that it provides authenticity and anonymity to honest users 
while preventing malicious users from abusing the anonymity to con- 
duct blackmail etc. Although plausible constructions that offer efficient 
tricks for anonymity revocation have been published, security, especially 
one-more unforgeability and revocability against adaptive and parallel 
attacks, has not been studied well. We point out a concrete vulnerability 
of some of the previous schemes and present an efficient fair blind sig- 
nature scheme with a security proof against most general attacks. Our 
scheme offers tight revocation where each signature and issuing session 
can be linked by the trustee. 



1 Introduction 

Fair blind signature schemes are a variant of blind signature schemes; they allow 
a trustee to revoke the blindness in such ways that 

— given a view of a signature issuing session conducted with an authenticated 
user, the trustee can identify the resulting signature (Signature Tracing), or 

— given a signature, the trustee can identify the issuing session that yielded 
the signature, which eventually identifies the user who conducted the session 
(Session Tracing). 

Such schemes will play an important role in applications that must offer both 
privacy and authenticity while preventing users from abusing anonymity. See 
[25] for a concrete example. 

The notion of fair blind signatures was introduced independently in [6,9] for
the construction of anonymous electronic payment schemes. Since then, some
efficient constructions have been shown [23,7] and several different approaches
to the same goal have been taken [12,16]. These previous schemes provide efficient
revocation mechanisms but their security, especially in terms of revocability and
unforgeability against adaptive and parallel attacks, has not been rigorously
studied. Indeed, even the security of ordinary blind signatures against parallel
attacks has been studied formally only in recent works [20,17,22,2,1].

In some schemes, revocation is limited to linking a signature to its owner. 
There are some other schemes that allow a signature to be linked to a particular 
issuing session. Such a fine revocation, for instance, allows one to know the 
issuing time of the target signature from the session log. Typically, revocation
in this type of scheme reveals the randomness generated by the user during
the issuing session. Accordingly, if a malicious user broadcasts a value via the
Internet and encourages all other users to use it as the random parameter in
issuing sessions, revocation becomes useless. Some known schemes, e.g. [16,7,15],
are vulnerable to this attack, or they implicitly resort to on-the-fly
freshness checking, which is expensive in practice.

Our contribution is an efficient fair-blind signature scheme that is secure 
against adaptive and parallel attacks. Assuming the existence of ideal hash func- 
tions [5], its blindness is proven under the decision Diffie-Hellman assumption, 
and revocability and one-more unforgeability against adaptive and parallel at- 
tacks are proven under the discrete logarithm assumption. Another advantage of 
our scheme is that it offers tight revocation. That is, given a signature, revocation 
identifies the issuing session that uniquely produced the signature, and, given 
a session view, revocation identifies the unique signature created in the session. 
Naturally, once such tight revocability is achieved, the scheme also provides 
one-more unforgeability since tight and bi-directional revocability guarantees 
one-to-one mapping between issuing sessions and resulting signatures. 

The rest of this paper is organized as follows. Section 2 defines the security 
of fair blind signatures. Section 3 reviews underlying ideas and building blocks. 
Section 4 presents our scheme in detail. A security analysis is given in Section 5. 
Section 6 gives several remarks, including weaknesses of our scheme,
modifications, and open problems.



2 Definitions 

Let $(\mathcal{G}_S, \mathcal{S}, \mathcal{U}, \mathcal{V})$ be a blind signature scheme where $\mathcal{G}_S$ is a signing key
generation algorithm, $\mathcal{S}$ and $\mathcal{U}$ are interactive Turing machines called signer and
user, and $\mathcal{V}$ is a signature verification algorithm. (Please refer to [17,22] for a
formal functional definition of blind signature schemes.) Informally, a fair blind
signature scheme with off-line trustee is a blind signature scheme with five
additional probabilistic polynomial-time algorithms, $\mathcal{G}_T$, $\mathcal{R}_{sig}$, $\mathcal{R}_{sid}$, $\mathcal{M}_{sig}$, and $\mathcal{M}_{sid}$
as follows.

$\mathcal{G}_T$ is a revocation key generation algorithm that takes a public key of a signer,
    say $pk$, and outputs a private and public revocation key pair. The keys can
    be independent of the public key of the signer (thus only one revocation key
    pair for all signers); $(rsk, rpk) \leftarrow \mathcal{G}_T(1^n, pk)$.

$\mathcal{R}_{sig}$ is a revocation algorithm that generates signature identifier $I_{sig}$ that
    identifies the signature yielded from the target session. It takes the view of the
    signer during the target session and revocation key; $I_{sig} \leftarrow \mathcal{R}_{sig}(view, rsk)$.

$\mathcal{R}_{sid}$ is a revocation algorithm that generates session identifier $I_{sid}$ that identifies
    the session that has produced target signature-message pair $\Sigma_m$;
    $I_{sid} \leftarrow \mathcal{R}_{sid}(\Sigma_m, rsk)$.

$\mathcal{M}_{sig}$ is a matching algorithm that examines whether $I_{sig}$ matches
    signature-message pair $\Sigma_m$ or not. It outputs 1 if they match, 0 otherwise;
    $0/1 \leftarrow \mathcal{M}_{sig}(I_{sig}, \Sigma_m)$.

$\mathcal{M}_{sid}$ is a matching algorithm that examines whether $I_{sid}$ matches $view_i$ or
    not. It outputs 1 if they match, 0 otherwise; $0/1 \leftarrow \mathcal{M}_{sid}(I_{sid}, view_i)$.



These algorithms also take public data such as $pk$ and $rpk$ if needed. Although
$view_i$ includes everything that the signer can see during the session, which includes
his own private key, what is really necessary to complete revocation
differs depending on the specific revocation mechanism used.

We start the security definitions with traceability. Intuitively, a
scheme is session traceable if no adversary can output a signature that cannot
be associated with the corresponding session, or that can be associated with more
than one session by revocation. Accordingly, it assures that each valid signature
if no adversary can output two signatures that will be associated to the same 
session. Hence, it assures that every session should be linked to a single valid 
signature. If a scheme provides both types of traceability, shown below, we say 
that the scheme offers tight revocation. 



Definition 1. (Signature Traceability) A fair blind signature scheme is
signature traceable if any probabilistic polynomial-time algorithm $\mathcal{U}^*$, after
interacting with legitimate signer $\mathcal{S}$ at most $\ell$ times in an adaptive and arbitrarily
interleaving manner, outputs

- a valid signature-message pair, say $\Sigma_m$, such that, for $I_{sig}^i = \mathcal{R}_{sig}(view_i, rsk)$,
  $\mathcal{M}_{sig}(I_{sig}^i, \Sigma_m) = 0$ holds for all $i = 1, \ldots, \ell$, or

- two valid and different signature-message pairs, say $\Sigma_{m_0}, \Sigma_{m_1}$, such that
  there exists $i$ such that $\mathcal{M}_{sig}(I_{sig}, \Sigma_{m_0}) = \mathcal{M}_{sig}(I_{sig}, \Sigma_{m_1}) = 1$
  where $I_{sig} = \mathcal{R}_{sig}(view_i, rsk)$,

with probability at most $1/n^c$ for sufficiently large $n$ and some constant $c$. The
probability is taken over the coin flips of $\mathcal{G}_S$, $\mathcal{G}_T$, $\mathcal{S}$, and $\mathcal{U}^*$.



Definition 2. (Session Traceability) A fair blind signature scheme is session
traceable if any probabilistic polynomial-time algorithm $\mathcal{U}^*$, after
interacting with legitimate signer $\mathcal{S}$ at most $\ell$ times in an adaptive and arbitrarily
interleaving manner, outputs a valid signature-message pair $\Sigma_m$ such that

- for $I_{sid} = \mathcal{R}_{sid}(\Sigma_m, rsk)$, $\mathcal{M}_{sid}(I_{sid}, view_i) = 0$ holds for all $i = 1, \ldots, \ell$, or

- there exist $i, j$, $i \neq j$, such that $\mathcal{M}_{sid}(I_{sid}, view_i) = \mathcal{M}_{sid}(I_{sid}, view_j) = 1$,

with probability at most $1/n^c$ for sufficiently large $n$ and some constant $c$. The
probability is taken over the coin flips of $\mathcal{G}_S$, $\mathcal{G}_T$, $\mathcal{S}$, and $\mathcal{U}^*$.

Note that, in the random oracle model, these success probabilities also depend 
on the choice of random oracles. 

Next is blindness, which informally means that any adversary that colludes 
with the signer can distinguish two session views only with negligible advantage 
when one of the views results in a given signature. 

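Definition 3. (Blindness) Let $\mathcal{S}^*$ and $\mathcal{D}^*$ be probabilistic poly-time algorithms
that play the following game with honest users $\mathcal{U}_0$ and $\mathcal{U}_1$.

1. $(pk, sk) \leftarrow \mathcal{G}_S(1^n)$, $(rsk, rpk) \leftarrow \mathcal{G}_T(1^n, pk)$

2. $(msg_0, msg_1) \leftarrow \mathcal{S}^*(sk, rpk)$

3. For $b \in_U \{0, 1\}$, $msg_b$ is given to $\mathcal{U}_0$, and $msg_{1-b}$ is given to $\mathcal{U}_1$.

4. $\mathcal{S}^*$ engages in the signature issuing protocol with $\mathcal{U}_0$, $\mathcal{U}_1$ in arbitrary order.

5. Resulting signature $\Sigma_0$ for $msg_0$ is given to $\mathcal{D}^*$. $\mathcal{D}^*$ is also allowed to take any
   information from $\mathcal{S}^*$.

6. $\mathcal{D}^*$ outputs $b' \in \{0, 1\}$.

The signature scheme is blind if, for all polynomial-time $\mathcal{S}^*$ and $\mathcal{D}^*$, $b' = b$
happens with probability at most $1/2 + 1/n^c$ for sufficiently large $n$ and some
constant $c$. The probability is taken over the coin flips of $\mathcal{G}_T$, $\mathcal{G}_S$, $\mathcal{S}^*$, $\mathcal{D}^*$,
$\mathcal{U}_0$, $\mathcal{U}_1$, and $b$.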

Finally, we define one-more unforgeability in such a sense that it is infeasible
to output $\ell + 1$ valid signatures after interacting with the signer $\ell$ times.

Definition 4. (One-more unforgeability) A blind signature scheme is $(\ell, \ell+1)$-unforgeable
if, for any probabilistic polynomial-time algorithm $\mathcal{U}^*$, $\mathcal{U}^*$ outputs
$\ell + 1$ valid signatures with probability at most $1/n^c$ for sufficiently large $n$ and
some constant $c$ after interacting with legitimate signer $\mathcal{S}$ at most $\ell$ times. The
interaction can be done in an adaptive and arbitrarily interleaving manner. The
probability is taken over the coin flips of $\mathcal{G}_S$, $\mathcal{S}$, and $\mathcal{U}^*$.

It is important to see that if a scheme provides tight revocability, the scheme 
is one-more unforgeable since tight revocability assures that there is one-to-one 
correspondence between successful sessions and valid signatures. Accordingly, it
suffices to prove blindness and tight revocability for our scheme.

The above definitions are weak since the adversaries have no access to the 
trustee. Thus it is important for the trustee not to show the tracing information 
to anybody to prevent the adversaries from using the trustee as an oracle. When 
revocation is done only for private purposes such as criminal investigation, such 
weak definitions may suffice. Although our scheme provides security only in a 
weak sense, one can define a stronger notion of security by modifying the above 
definitions. Informally, the scheme provides strong signature/session traceability
if traceability is retained even if the private revocation key $rsk$ is given to $\mathcal{U}^*$ in
Definitions 1 and 2. Similarly, we say a scheme provides strong blindness if
blindness is retained even if $\mathcal{S}^*$ and $\mathcal{D}^*$ are allowed to ask the trustee for revocation
except for the sessions and the signature in question.

3 Underlying Idea and Building Blocks

3.1 Efficient Revocation Mechanism 

We take an approach similar to that introduced in [24,7]. Let $x_t, y_t (= g^{x_t})$ be the
revocation key pair. Let $z$ be a part of the signer's public key. To ask for a signature,
the user sends $(z^{1/\gamma}, g^{\gamma})$ to the signer where $\gamma$ is a blinding factor that will be
used later in blinding. The signer then blindly issues a signature bringing a
pair $(z^{1/\gamma}, y_t)$ into the issuing protocol in such a way that a valid signature can
be obtained only if the pair is blinded into $(z, y_t^{\gamma})$. The user can get a valid
signature as he can do the conversion by taking the $\gamma$-th power. The signer is
left blind since $z$ is common to all signatures and $(y_t, g^{\gamma}, y_t^{\gamma})$ is assumed to be
indistinguishable from $(y_t, g^{\gamma}, y_t^{\gamma'})$ with random $\gamma'$ used for another signature.

Given a signature that contains $y_t^{\gamma}$, the trustee can trace the session that
contains $g^{\gamma}$ by computing $(y_t^{\gamma})^{1/x_t} (= g^{\gamma})$. Similarly, given a session log that
contains $g^{\gamma}$, the trustee can trace the resulting signature that must contain $y_t^{\gamma}$
by computing $(g^{\gamma})^{x_t} (= y_t^{\gamma})$.

For the above revocation mechanism to function, we must be sure that
blinding by exponentiation, $(z^{1/\gamma}, y_t) \rightarrow (z, y_t^{\gamma})$, is the only way to get a valid
signature. A blind signature scheme from [1] suits this purpose. As well as its security
against adaptive and parallel attacks, one good property we can exploit is the
restrictive blinding property. That is, when the signer issues a signature based on
$(z, z_1)$ a user has to blind it into $(z^{\gamma}, z_1^{\gamma})$ to have the signature correctly blinded.
So if we set $(z, z_1) = (z^{1/\gamma}, y_t)$, it must be transformed into $(z, y_t^{\gamma})$.

This trick, however, offers tight revocation only if all users are honest in
choosing a unique $\gamma$ in each session. Our idea for tight revocation is to add extra
randomness $v$ to the blinding factor from the signer's side so that $y_t^{\gamma v}$ is involved
in the signature. With this adaptation, the signer can randomize blinding factor
$\gamma$ chosen by the user into $\gamma v$ so that it is unique in every session.
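The basic tracing identity, before the signer's extra randomness $v$ is added, can be checked numerically. The following is a minimal sketch under stated assumptions: the toy group ($p = 2039$, $q = 1019$, $g = 4$) and the helper names `trace_signature` and `trace_session` are hypothetical and chosen only for illustration.

```python
import secrets

# Toy subgroup of prime order q in Z_p^* (assumption; insecure sizes): p = 2q + 1.
p, q, g = 2039, 1019, 4

def rand_exp() -> int:
    """Random nonzero exponent in Z_q^*."""
    return secrets.randbelow(q - 1) + 1

x_t = rand_exp()              # trustee's private revocation key
y_t = pow(g, x_t, p)          # trustee's public revocation key

gamma = rand_exp()            # user's blinding factor
xi = pow(g, gamma, p)         # g^gamma, seen by the signer in the session
tag = pow(y_t, gamma, p)      # y_t^gamma, embedded in the signature

def trace_signature(session_value: int) -> int:
    """From g^gamma in a session log, compute the y_t^gamma the signature must contain."""
    return pow(session_value, x_t, p)

def trace_session(signature_value: int) -> int:
    """From y_t^gamma in a signature, compute the g^gamma the session must contain."""
    return pow(signature_value, pow(x_t, -1, q), p)   # exponent 1/x_t mod q (Python 3.8+)
```

Both directions depend only on $x_t$, so only the trustee can link a session to a signature.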



3.2 Verifiable Encryption of DL 

For the reduction in our security proof to work, we need the trustee (simulator)
to be able to extract not only $y_t^{\gamma}$ but also $\gamma$ itself. For this purpose, a user
encrypts $\gamma$ with the public encryption key of the trustee and proves that $\gamma$ can
certainly be recovered from the ciphertext. Generally speaking, an encryption
scheme accompanied by a non-interactive proof that assures the receiver that the
embedded plaintext satisfies some poly-time computable predicate is often called
a verifiable encryption scheme. Concrete examples can be seen in the literature,
e.g. [4,3,8].

Let $C = (z_u, \xi) = (z^{1/\gamma}, g^{\gamma})$ be a commitment of witness $\gamma$. Let $(\mathcal{G}_E, \mathcal{E}, \mathcal{D})$
be a public-key encryption scheme. Let $(ek, dk) \leftarrow \mathcal{G}_E(1^n)$ and $E \leftarrow \mathcal{E}_{ek}(\gamma; \omega)$
where $\omega$ is a random tape. Let $\mathcal{R}$ be a relation between $C$ and $E$ such that

\[ (C, E) \in \mathcal{R} \iff \log_{z_u} z = \log_g \xi = \mathcal{D}_{dk}(E) \bmod q. \]

Let $(\mathcal{P}, \mathcal{V})$ be a non-interactive zero-knowledge proof (argument) system for
relation $\mathcal{R}$ such that $P \leftarrow \mathcal{P}(C, E, \gamma, \omega, ek)$ and $0/1 \leftarrow \mathcal{V}(C, E, P)$. We assume that
it provides correctness, soundness, and computational zero-knowledge. Note that
when it is a zero-knowledge argument the soundness is conditionally achieved
under some intractability assumptions.

On top of this standard security, we need it to be simulatable in such a
sense that, for $C = (z^{1/\gamma}, g^{\gamma})$, there exists a poly-time simulator which,
without being given $\gamma$ and $dk$, outputs $(E, P)$ such that $(C, E) \notin \mathcal{R}$ and $(E, P)$ is
computationally indistinguishable from correct $(E, P)$ that satisfies $(C, E) \in \mathcal{R}$
and $\mathcal{V}(C, E, P) = 1$. We say that a verifiable encryption scheme is secure and
simulatable if it provides all these properties. Note that we only consider passive
adversaries who have no access to the decryption oracle. When the encryption
scheme is semantically secure against chosen plaintext attacks and the proof
system is a public-coin honest verifier zero-knowledge proof made non-interactive
with the Fiat-Shamir technique [11], simulatability is provided under the
embedded assumption for the semantic security of the encryption scheme and the
random oracle assumption.

Appendices A and B show two examples of verifiable encryption that provide
all of the security properties we need in our construction. These schemes have 
different flavors. The scheme in Appendix A is taken from [3] and is based on 
Okamoto-Uchiyama encryption [19] combined with the statistical zero-knowledge 
argument of [14]. In this scheme, it is assumed that the decryption key is not 
given to the adversary in order to assure soundness. Accordingly, if this scheme is 
integrated in our construction, one has to assume that the trustee and the users 
are not colluding. The second scheme in Appendix B is newly constructed based 
on ElGamal encryption and a log-round perfect zero-knowledge proof. Though 
its efficiency is worse than that of the first one, this scheme provides a stronger 
property in that soundness holds even if the decryption key of the trustee is 
given to the adversary. 



4 Our Scheme 

[Signing Key Generation] 

Let $\mathcal{G}$ be a probabilistic polynomial-time algorithm that generates a group
parameter, $(p, q, g, h)$ where $p, q$ are primes and $g, h$ are generators of the subgroup
of order $q$ in $\mathbb{Z}_p^*$. A signer selects three hash functions $\mathcal{H}_1 : \{0,1\}^* \rightarrow \langle g \rangle$,
$\mathcal{H}_{2,3} : \{0,1\}^* \rightarrow \{0,1\}^{|q|}$ and generates public-key $pk = (p, q, g, h, y, z)$ and
private-key $sk = (x)$ as follows;

\[ (p, q, g, h) \leftarrow \mathcal{G}(1^n), \]
\[ x \in_U \mathbb{Z}_q, \]
\[ y = g^x \bmod p, \]
\[ z = \mathcal{H}_1(p, q, g, h, y). \]

All arithmetic operations are done in $\langle g \rangle$ hereafter unless otherwise noted.

[Revocation Key Generation]

Given the public key of a signer, the trustee generates secret-key $rsk = (x_t, dk)$
and public-key $rpk = (y_t, ek)$ where $x_t \in_U \mathbb{Z}_q^*$, $y_t = g^{x_t}$, and $ek, dk$ are the key
pair for the verifiable encryption scheme described in Section 3.2.

Depending on the encryption algorithm $\mathcal{E}$ used for verifiable encryption,
$(ek, dk)$ can be common for all signers. Similarly, if $(p, q, g)$ are common as
system-wide parameters, $x_t, y_t$ can be common, too.

[Signature Generation] 

Here, we describe the signature issuing protocol at a high level. Details can
be found in Figure 1.

1. The user chooses blinding factor $\gamma$ and computes $z_u = z^{1/\gamma}$ and $\xi = g^{\gamma}$.
   He then executes verifiable encryption where $\gamma$ is encrypted into $E$ and the
   relation among $z_u, \xi, E$ is proven by providing $P$.

2. The signer verifies $(E, P)$. He generates $v$ randomly, and computes $z_1 = y_t^{v}$
   and $z_2 = z_u / z_1$. He then proves to the user that $z_1$ is made as it should
   be by providing Schnorr zero-knowledge proof $P_s = (\sigma_s, c_s)$ where
   $c_s = \mathcal{H}_3(z_1 \| y_t^{v_s})$ and $\sigma_s = v_s - c_s v \bmod q$ for $v_s \in_U \mathbb{Z}_q$. The proof will be
   verified by the user as $c_s \stackrel{?}{=} \mathcal{H}_3(z_1 \| y_t^{\sigma_s} z_1^{c_s})$.

3. Based on $y, z_1, z_2$, the signer and the user engage in an interactive proof
   protocol. For the signer, the protocol is a witness indistinguishable proof of
   knowledge of

   \[ \log_g y \vee (\log_g z_1 \wedge \log_h (z_u / z_1)). \]

   The user converts the proof into the one for

   \[ \log_g y \vee (\log_g \zeta_1 \wedge \log_h (z / \zeta_1)) \]

   by exponentiating $(z_1, z_u) \rightarrow (z_1^{\gamma}, z_u^{\gamma}) = (\zeta_1, z)$ and blinding it with the
   standard diversion technique [18]. The converted proof is eventually transformed
   into a signature with the Fiat-Shamir technique.

4. The signer stores $\xi^{v}$ as the identity of this session.

5. The user outputs a signature, $\Sigma = (\zeta_1, \rho, \varpi, \sigma_1, \sigma_2, \delta)$ for message $m$.

Note that $\xi^{v}$ can be published, though it is not necessary to the user. The signer
may provide an extra Schnorr zero-knowledge proof that proves $\log_{\xi}(\xi^{v}) = \log_{y_t} z_1$.
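The Schnorr-type proof $P_s$ of step 2 can be sketched as follows. This is an illustration only, under stated assumptions: the toy group parameters are hypothetical, SHA-256 reduced modulo $q$ stands in for $\mathcal{H}_3$, and the function names `prove_z1` and `verify_z1` are not from the paper.

```python
import hashlib
import secrets

# Toy subgroup of prime order q in Z_p^* (assumption; insecure sizes).
p, q, g = 2039, 1019, 4

def H3(*parts) -> int:
    """Stand-in for the hash function H_3 (assumption: SHA-256 reduced mod q)."""
    data = "||".join(str(x) for x in parts).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % q

def prove_z1(y_t: int, v: int):
    """Signer: compute z_1 = y_t^v and the proof P_s = (sigma_s, c_s)."""
    z1 = pow(y_t, v, p)
    v_s = secrets.randbelow(q)                  # fresh commitment randomness
    c_s = H3(z1, pow(y_t, v_s, p))              # c_s = H_3(z_1 || y_t^{v_s})
    sigma_s = (v_s - c_s * v) % q               # sigma_s = v_s - c_s * v mod q
    return z1, (sigma_s, c_s)

def verify_z1(y_t: int, z1: int, proof) -> bool:
    """User: accept iff c_s = H_3(z_1 || y_t^{sigma_s} * z_1^{c_s})."""
    sigma_s, c_s = proof
    commitment = pow(y_t, sigma_s, p) * pow(z1, c_s, p) % p
    return c_s == H3(z1, commitment)

# Usage with a hypothetical revocation key y_t = g^5 and signer randomness v = 77.
y_t = pow(g, 5, p)
z1, P_s = prove_z1(y_t, 77)
accepted = verify_z1(y_t, z1, P_s)
```

The check succeeds because $y_t^{\sigma_s} z_1^{c_s} = y_t^{v_s - c_s v + c_s v} = y_t^{v_s}$.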

[Verification] 

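A signature-message pair, $(\Sigma, m)$, is valid if it satisfies

\[ \varpi + \delta = \mathcal{H}_2(\zeta_1 \| g^{\rho} y^{\varpi} \| g^{\sigma_1} \zeta_1^{\delta} \| h^{\sigma_2} (z/\zeta_1)^{\delta} \| m) \bmod q. \tag{1} \]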



[Revocation] 

Signature Tracing: Given valid $(z_u, \xi, E, P)$ and $\xi^{v}$, the trustee computes
$I_{sig} = (\xi^{v})^{x_t}$. Observe that

\[ I_{sig} = (\xi^{v})^{x_t} = g^{\gamma v x_t} = y_t^{\gamma v} = \zeta_1. \tag{2} \]

Thus, $I_{sig}$ identifies the resulting signature.




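User:
    $\gamma \in_U \mathbb{Z}_q^*$
    $z_u = z^{1/\gamma}$, $\xi = g^{\gamma}$
    $E = \mathcal{E}_{ek}(\gamma; \omega)$
    $P = \mathcal{P}((z_u, \xi), \gamma, \omega, ek)$

User -> Signer: $z_u, \xi, E, P$

Signer:
    $\mathcal{V}((z_u, \xi), E, P, ek) \stackrel{?}{=} 1$
    $v \in_U \mathbb{Z}_q^*$
    $z_1 = y_t^{v}$, $z_2 = z_u / z_1$
    $P_s = \mathcal{P}_s(z_1, y_t, v)$
    $u, s_1, s_2, d \in_U \mathbb{Z}_q$
    $a = g^{u}$
    $b_1 = g^{s_1} z_1^{d}$
    $b_2 = h^{s_2} z_2^{d}$

Signer -> User: $P_s, z_1, a, b_1, b_2$

User:
    $z_1, b_1, b_2 \stackrel{?}{\in} \langle g \rangle$
    $\mathcal{V}_s(z_1, y_t, P_s) \stackrel{?}{=} 1$
    $\zeta_1 = z_1^{\gamma}$, $\zeta_2 = z / \zeta_1$
    $t_1, t_2, t_3, t_4, t_5 \in_U \mathbb{Z}_q$
    $\alpha = a g^{t_1} y^{t_2}$
    $\beta_1 = b_1^{\gamma} g^{t_3} \zeta_1^{t_5}$
    $\beta_2 = b_2^{\gamma} h^{t_4} \zeta_2^{t_5}$
    $\varepsilon = \mathcal{H}_2(\zeta_1 \| \alpha \| \beta_1 \| \beta_2 \| m)$
    $e = \varepsilon - t_2 - t_5 \bmod q$

User -> Signer: $e$

Signer:
    $c = e - d \bmod q$
    $r = u - cx \bmod q$

Signer -> User: $r, c, s_1, s_2, d$

User:
    $\rho = r + t_1 \bmod q$
    $\varpi = c + t_2 \bmod q$
    $\sigma_1 = \gamma s_1 + t_3 \bmod q$
    $\sigma_2 = \gamma s_2 + t_4 \bmod q$
    $\delta = d + t_5 \bmod q$
    $\varpi + \delta \stackrel{?}{=} \mathcal{H}_2(\zeta_1 \| g^{\rho} y^{\varpi} \| g^{\sigma_1} \zeta_1^{\delta} \| h^{\sigma_2} \zeta_2^{\delta} \| m)$
    Output: $(\zeta_1, \rho, \varpi, \sigma_1, \sigma_2, \delta)$

Fig. 1. The signature issuing protocol. The session aborts if any of the checks [...].
$E$, $P$ are from the underlying verifiable encryption scheme of Section 3[...].
[...] Schnorr-type proof of knowledge of $v$ w.r.t. $y_t$ and $z_1$. The trustee is [...]
involved in the issuing protocol.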

Session Tracing: Given a valid signature, the trustee computes $I_{ss} = \zeta_1^{1/x_t}$.
Observe that

\[ I_{ss} = \zeta_1^{1/x_t} = g^{\gamma v} = \xi^{v}. \tag{3} \]

Since $\xi^{v}$ is stored or published by the signer, $I_{ss}$ identifies the session that
issued the signature.
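The issuing protocol, the verification predicate (1), and the two tracing equations (2) and (3) can be exercised end to end with a short script. This is a sketch under several loudly stated assumptions, not the authors' implementation: the group is a toy one; SHA-256 stands in for $\mathcal{H}_1$ and $\mathcal{H}_2$; $h$ and $z$ are generated with known discrete logarithms, which would be insecure in practice; and the verifiable encryption $(E, P)$ and the proof $P_s$ are omitted. All function names are hypothetical.

```python
import hashlib
import secrets

p, q, g = 2039, 1019, 4                      # toy group (assumption; insecure sizes)

def rnd() -> int:                            # random element of Z_q^*
    return secrets.randbelow(q - 1) + 1

def H(tag: str, *parts) -> int:              # stand-in for the random oracles (assumption)
    data = (tag + "|" + "|".join(map(str, parts))).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big")

h = pow(g, rnd(), p)                         # toy second generator (log_g h known: insecure)
x = rnd(); y = pow(g, x, p)                  # signer key pair
z = pow(g, H("H1", p, q, g, h, y) % (q - 1) + 1, p)   # toy hash-to-group (insecure)
x_t = rnd(); y_t = pow(g, x_t, p)            # trustee revocation key pair

def issue(m: str):
    """Run user and signer in one process; returns (signature, session identity xi^v)."""
    gamma = rnd()                                             # user, step 1
    z_u, xi = pow(z, pow(gamma, -1, q), p), pow(g, gamma, p)
    v = rnd()                                                 # signer, step 2
    z1 = pow(y_t, v, p); z2 = z_u * pow(z1, -1, p) % p
    u, s1, s2, d = (secrets.randbelow(q) for _ in range(4))
    a = pow(g, u, p)
    b1 = pow(g, s1, p) * pow(z1, d, p) % p
    b2 = pow(h, s2, p) * pow(z2, d, p) % p
    zeta1 = pow(z1, gamma, p); zeta2 = z * pow(zeta1, -1, p) % p   # user blinding
    t1, t2, t3, t4, t5 = (secrets.randbelow(q) for _ in range(5))
    alpha = a * pow(g, t1, p) * pow(y, t2, p) % p
    beta1 = pow(b1, gamma, p) * pow(g, t3, p) * pow(zeta1, t5, p) % p
    beta2 = pow(b2, gamma, p) * pow(h, t4, p) * pow(zeta2, t5, p) % p
    eps = H("H2", zeta1, alpha, beta1, beta2, m) % q
    e = (eps - t2 - t5) % q
    c = (e - d) % q; r = (u - c * x) % q                      # signer response
    sig = (zeta1, (r + t1) % q, (c + t2) % q,
           (gamma * s1 + t3) % q, (gamma * s2 + t4) % q, (d + t5) % q)
    return sig, pow(xi, v, p)

def verify(sig, m: str) -> bool:
    """Verification predicate (1)."""
    zeta1, rho, varpi, sigma1, sigma2, delta = sig
    zeta2 = z * pow(zeta1, -1, p) % p
    rhs = H("H2", zeta1,
            pow(g, rho, p) * pow(y, varpi, p) % p,
            pow(g, sigma1, p) * pow(zeta1, delta, p) % p,
            pow(h, sigma2, p) * pow(zeta2, delta, p) % p, m) % q
    return (varpi + delta) % q == rhs

def trace_signature(session_id: int) -> int:      # equation (2): (xi^v)^{x_t} = zeta_1
    return pow(session_id, x_t, p)

def trace_session(sig) -> int:                    # equation (3): zeta_1^{1/x_t} = xi^v
    return pow(sig[0], pow(x_t, -1, q), p)

sig, sid = issue("hello")
```

The signer here answers with the $y$-side witness $x$ and simulates the $z$-side by choosing $s_1, s_2, d$ up front, as in Figure 1. Modular inverses via `pow(a, -1, n)` require Python 3.8 or later.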



5 Security Proofs 



5.1 Correctness 



Theorem 1. If the signer and the user follow the issuing protocol, the protocol 
completes with a valid signature with probability 1. 

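Proof. There are four verifications denoted by $\stackrel{?}{=}$ in the issuing protocol. The
verification for $P$ and $P_s$ in each side will accept the proof with probability 1
due to the correctness of these proof systems. It is clear that $z_1, b_1, b_2$ are in $\langle g \rangle$.
For the last one, which is equivalent to the verification predicate, observe that
the following holds.

\[ \varpi + \delta = c + t_2 + d + t_5 = e + t_2 + t_5 = \varepsilon \pmod{q} \]
\[ g^{\rho} y^{\varpi} = g^{r + t_1} y^{c + t_2} = a g^{t_1} y^{t_2} = \alpha \]
\[ g^{\sigma_1} \zeta_1^{\delta} = g^{\gamma s_1 + t_3} \zeta_1^{d + t_5} = b_1^{\gamma} g^{t_3} \zeta_1^{t_5} = \beta_1 \]
\[ h^{\sigma_2} \zeta_2^{\delta} = h^{\gamma s_2 + t_4} \zeta_2^{d + t_5} = b_2^{\gamma} h^{t_4} \zeta_2^{t_5} = \beta_2 \]

Thus, the protocol always stops with a valid signature if both parties follow the
protocol. □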



5.2 Blindness 

Theorem 2. The proposed scheme is blind if all hash functions are random
oracles, the decision Diffie-Hellman problem is intractable, and the underlying
verifiable encryption scheme is secure and simulatable in the random oracle model.

Proof. Suppose that $(\mathcal{S}^*, \mathcal{D}^*)$ is successful in breaking blindness with probability
$1/2 + \epsilon$ where $\epsilon$ is not negligible. We show that $\mathcal{S}^*$ and $\mathcal{D}^*$ can be used to solve the
DDH problem. Define $DH = \{(X_1, X_2, X_3) \in \langle g \rangle^3 \mid \log_g X_1 \cdot \log_g X_2 = \log_g X_3\}$
and $RND = \{(X_1, X_2, X_3) \in \langle g \rangle^3\}$. Let $(A, B, C) \in \langle g \rangle^3$ be a DDH instance,
i.e., taken from $DH$ or $RND$ with equal probability. Let $(A, B, C) = (g^a, g^b, g^c)$.
If any of $a, b, c$ is zero, we can immediately determine whether the instance is
in $DH$ or not. So we assume that none of them are zero hereafter.

Simulation proceeds as follows. We simulate hash function $\mathcal{H}_1$ so that it
outputs $B^{r_{1i}}$ by selecting $r_{1i} \in_U \mathbb{Z}_q^*$ for each fresh query. Suppose that $r_1$ is
selected for $z = \mathcal{H}_1(p, q, \ldots) = B^{r_1}$. Next choose $r_2 \in_U \mathbb{Z}_q^*$ and set the revocation
public key as $y_t = A^{r_2}$. Select $d \in_U \{0, 1\}$ and execute the issuing protocol with
$\mathcal{S}^*$ twice. Label the executions $run_0$ and $run_1$. In $run_{1-d}$, we simply follow the
protocol. In $run_d$, we first set $z_u = g^{r_1}$ and $\xi = B$. Observe that $z, z_u$, and $\xi$
are perfectly simulated no matter whether $(A, B, C)$ is from $DH$ or $RND$ since
$z, z_u, \xi$ satisfy $\log_{z_u} z = \log_g \xi$ $(= \log_g B)$. We then simulate $E$ by encrypting
$r_3 \in_U \mathbb{Z}_q^*$. Since $r_3 \neq \log_g B$ in general, $((z_u, \xi), E) \notin \mathcal{R}$. However, the simulator
can produce $P$ in such a way that $(E, P)$ is computationally indistinguishable
from the real ones since we assume that the underlying verifiable encryption is
simulatable in such a sense. Now send $z_u, \xi, E, P$ and receive $P_s$, $z_1$, etc. from
$\mathcal{S}^*$. At this point, we rewind $\mathcal{S}^*$ to extract $v$ from $P_s$ by applying the Forking
Lemma [21]. We then continue and complete the issuing session.

For message $m_0$ given by $\mathcal{S}^*$ at the beginning, the simulator generates a
signature-message pair, say $(\Sigma, m_0)$, with regard to $\zeta_1 = C^{r_2 v}$. Other variables
except for $\zeta_1$ in $\Sigma$ are generated by using the standard zero-knowledge simulation
technique; randomly choose $\rho, \varpi, \sigma_1, \sigma_2, \delta$, and then freely define $\mathcal{H}_2$ so that they
look consistent. Given $(\Sigma, m_0)$ and views from $\mathcal{S}^*$, distinguisher $\mathcal{D}^*$ outputs $d'$.
If $d' = d$, we conclude that $(A, B, C)$ is in $DH$. It is in $RND$, otherwise.

We now claim that if $(A, B, C) \in DH$, $\Sigma$ is a valid signature that could have
been produced in $run_d$. Observe that, for $z_u, z_1$ used in $run_d$,

\[ \log_{z_u} z = \log_{g^{r_1}} B^{r_1} = b, \qquad \log_{z_1} \zeta_1 = \log_{A^{r_2 v}} C^{r_2 v} = c/a. \]

So if $ab = c$, we have a consistent blinding factor, $\gamma = b$, which satisfies
$\gamma = \log_{z_u} z = \log_{z_1} \zeta_1$ for $z_u$ and $z_1$ used in $run_d$. Furthermore, there are blinding
factors $t_1, t_2, t_3, t_4, t_5$ that convert the view of $run_d$ into the remaining elements
in $\Sigma$. On the other hand, $\Sigma$ could have been produced by $run_{1-d}$ only with
negligible probability as $\zeta_1$ should differ in $run_{1-d}$. Accordingly, given $\Sigma$, $\mathcal{D}^*$
outputs $d' = d$ with probability $1/2 + \epsilon$.

Next, we claim that if $(A, B, C) \in RND$, $\Sigma$ is statistically independent of
the views of the signer in $run_0$ and $run_1$ since $\log_{z_u} z \neq \log_{z_1} \zeta_1$ holds for $(z_u, z_1)$
in both runs except with negligible probability. Hence, $d$ is also statistically
independent of the view of the signer, and $d' = d$ happens with probability close
to $1/2$ except for a negligible fraction.

In total, the success probability is $\frac{1}{2}(1/2 + \epsilon) + \frac{1}{2}(1/2) = 1/2 + \epsilon/2$, which
contradicts the DDH assumption when $\epsilon$ is not negligible.

□ 



5.3 Tight Revocability 

Theorem 3. The proposed scheme is session traceable if all hash functions are 
random oracles, the discrete logarithm is intractable, and the soundness condition 
of the underlying verifiable encryption scheme holds. 

Proof. Here we must show two properties. We first show that it is infeasible for a
user to produce a signature $\Sigma^* = (\zeta_1, \rho, \varpi, \sigma_1, \sigma_2, \delta)$ such that $\log_{z_u} z \neq \log_{z_1} \zeta_1$
for all $(z_u, z_1)$ used in issuing sessions. We then show that a valid signature
cannot be linked to more than one session.

Assume that having at most $q_h$ accesses to $\mathcal{H}_2$ and asking at most $\ell$
signatures to $\mathcal{S}$, $\mathcal{U}_0^*$ outputs signature $\Sigma^* = (\zeta_1, \rho, \varpi, \sigma_1, \sigma_2, \delta)$ that satisfies
$\log_{z_u} z \neq \log_{z_1} \zeta_1$ for $(z_u, z_1)$ used in any session. Here, $q_h$ and $\ell$ are bounded
by a polynomial of security parameter $n$. Let $\epsilon_0$ be the success probability of
$\mathcal{U}_0^*$, which is not negligible in $n$. We randomly fix an index $Q \in \{1, \ldots, q_h\}$ and
regard $\mathcal{U}_0^*$ as successful only if the resulting signature corresponds to the $Q$-th
query to $\mathcal{H}_2$. (If it does not correspond to any query, $\mathcal{U}_0^*$ is successful only with
negligible probability due to the randomness of $\mathcal{H}_2$.) Accordingly, it is
equivalent to assuming an adversary, say $\mathcal{U}_1^*$, that asks $\mathcal{H}_2$ only once and succeeds with
probability $\epsilon_1 \geq \epsilon_0 / q_h$. By using $\mathcal{U}_1^*$, we construct machine $\mathcal{M}_1$ that solves the
discrete-log problem. Let $(p, q, g, Y)$ be an instance of the discrete-log problem
to solve $X = \log_g Y$ in $\mathbb{Z}_q$.

Reduction Algorithm: $\mathcal{M}_1$ first sets $(p, q, g) := (p, q, g)$. It also generates key
pair $(dk, ek)$ for the underlying verifiable encryption scheme. It then flips a coin
$\chi \in_U \{0, 1\}$ to select either $y := Y$ (case $\chi = 0$), or $h := Y$ (case $\chi = 1$).

Case $\chi = 0$.

Intuition: We set $y = Y$ and attempt to extract the $y$-side witness by
simulating the signing oracle with the $z$-side witness, which is $\log_g z_1$ and $\log_h z_2$. We
run $\mathcal{U}_1^*$ twice with a different answer from $\mathcal{H}_2$ and apply the Forking Lemma.
It should cause a change of either $\delta$ or $\varpi$ in the resulting signatures. If we are
lucky, we have different $\varpi$'s and can extract the $y$-side witness.

1. $\mathcal{M}_1$ sets $y = Y$.

2. $\mathcal{M}_1$ selects $w, w_0, w_1 \in_U \mathbb{Z}_q^*$ and sets $h := g^{w}$, $z := \mathcal{H}_1(p \| q \| g \| h \| y) = g^{w_0}$,
   and $y_t = g^{w_1}$.

3. $\mathcal{M}_1$ runs $\mathcal{U}_1^*$ and simulates $\mathcal{S}$ for the $i$-th query in the following way.

   a) Given $(z_{ui}, \xi_i, E_i, P_i)$ from the user, check $P_i$ and reject if incorrect.
      Otherwise, decrypt $E_i \rightarrow \gamma_i$.

   b) Compute $a_i := g^{r_i} y^{c_i}$ for $c_i, r_i \in_U \mathbb{Z}_q$.

   c) Compute $w_{1i} = w_1 v_i \bmod q$ and $w_{2i} = (w_0/\gamma_i - w_{1i})/w \bmod q$ for $v_i \in_U \mathbb{Z}_q^*$.
      Then set $z_{1i} = g^{w_{1i}}$ and $z_{2i} = h^{w_{2i}}$.

   d) Compute $P_{si}$ by using legitimate witness $v_i$.

   e) Compute $b_{1i} := g^{u_{1i}}$ and $b_{2i} := h^{u_{2i}}$ with $u_{1i}, u_{2i} \in_U \mathbb{Z}_q$.

   f) Send $P_{si}, a_i, b_{1i}, b_{2i}$ to $\mathcal{U}_1^*$.

   g) Given $e_i$ from $\mathcal{U}_1^*$, compute $d_i := e_i - c_i \bmod q$, $s_{1i} := u_{1i} - d_i w_{1i} \bmod q$,
      and $s_{2i} := u_{2i} - d_i w_{2i} \bmod q$.

   h) Send $r_i, c_i, s_{1i}, s_{2i}, d_i$ to $\mathcal{U}_1^*$.

   $\mathcal{M}_1$ simulates $\mathcal{H}_2$ by returning $\varepsilon \in_U \mathbb{Z}_q$.

4. $\mathcal{U}_1^*$ outputs a signature, say $(\zeta_1, \rho, \varpi, \sigma_1, \sigma_2, \delta)$, that corresponds to $\varepsilon$.

5. Reset and restart $\mathcal{U}_1^*$ with the same setting. $\mathcal{M}_1$ simulates $\mathcal{H}_2$ with $\varepsilon' \in_U \mathbb{Z}_q$.
   In this second run, $\mathcal{M}_1$ also uses the same random tape.

6. $\mathcal{U}_1^*$ outputs a signature, say $(\zeta_1, \rho', \varpi', \sigma_1', \sigma_2', \delta')$, that corresponds to $\varepsilon'$.

7. If $\varpi \neq \varpi'$, $\mathcal{M}_1$ outputs $X := (\rho - \rho')/(\varpi' - \varpi) \bmod q$. The simulation fails,
   otherwise.
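The extraction in step 7 is ordinary special soundness: two accepting transcripts with the same $\alpha = g^{\rho} y^{\varpi}$ but different $\varpi$ determine $\log_g y$. The following sketch checks this numerically under stated assumptions: the toy group, the secret value, and the function name `extract_y_witness` are hypothetical, and the two transcripts are constructed directly rather than obtained by rewinding an adversary.

```python
# Toy subgroup of prime order q in Z_p^* (assumption; insecure sizes).
p, q, g = 2039, 1019, 4

def extract_y_witness(rho: int, varpi: int, rho2: int, varpi2: int) -> int:
    """Return X = (rho - rho') / (varpi' - varpi) mod q, as in step 7."""
    diff = (varpi2 - varpi) % q
    if diff == 0:
        raise ValueError("simulation fails: varpi == varpi'")
    return (rho - rho2) * pow(diff, -1, q) % q   # modular inverse, Python 3.8+

# Build two transcripts that share alpha = g^k but answer different challenges.
X = 321                      # the discrete log the reduction wants (toy secret)
y = pow(g, X, p)
k = 555                      # alpha = g^k is fixed by the common random tape
varpi, varpi2 = 17, 901      # challenges on the y-side after the fork
rho = (k - X * varpi) % q
rho2 = (k - X * varpi2) % q

alpha_first = pow(g, rho, p) * pow(y, varpi, p) % p
alpha_second = pow(g, rho2, p) * pow(y, varpi2, p) % p
recovered = extract_y_witness(rho, varpi, rho2, varpi2)
```

Since $\rho + X\varpi = \rho' + X\varpi' \pmod q$, dividing $\rho - \rho'$ by $\varpi' - \varpi$ yields $X$.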




594 



M. Abe and M. Ohkubo 



Case $\chi = 1$.

Intuition: We set $h = Y$, $z = g^{w_1} h^{w_2}$ with random $w_1, w_2$, and attempt
to extract a different representation of $z$, which leads to $\log_g h$. The signing oracle is
simulated with the $y$-side witness except for one query. For the one randomly chosen
$J$-th query, we use the $y$-side witness and the $z$-side witness, i.e., $(w_1, w_2)$, together.
We rewind $\mathcal{U}_1^*$ to apply the Forking Lemma. But this time, we fork the process
by changing $d$ in the $J$-th issuing session, which is used as a challenge to the
$z$-side proof. We can answer two different $d$'s in the $J$-th session since the
$z$-side witness in this session is $(w_1, w_2)$. Now if $\delta$ is sensitive to the change of $d$,
we have different $\delta$'s and can extract the $z$-side witness which is different from
$(w_1, w_2)$.

1. $\mathcal{M}_1$ sets $h = Y$.

2. $\mathcal{M}_1$ selects $x \in_U \mathbb{Z}_q$ and sets $y := g^{x}$. It also selects $w_1, w_2 \in_U \mathbb{Z}_q$ and sets
   $z := \mathcal{H}_1(p \| q \| g \| h \| y) = g^{w_1} h^{w_2}$.

3. $\mathcal{M}_1$ selects $J \in_U \{1, \ldots, \ell\}$. It also selects $v_J$ and sets $y_t =$ [...]

4. $\mathcal{M}_1$ runs $\mathcal{U}_1^*$ and simulates the signing oracle for the $i$-th query in the
   following way.

   a) For $i \neq J$, $\mathcal{M}_1$ follows the protocol with the $y$-side witness, $x$. $\mathcal{H}_2$ is simulated
      by returning random choices from $\langle g \rangle$.

   b) For $i = J$, $\mathcal{M}_1$ engages in the issuing protocol using $x$ and $(w_1, w_2)$ as
      follows.

      i. Given $(z_{uJ}, \xi_J, E_J, P_J)$ from the user, check $P_J$ and reject if incorrect.
         Otherwise, decrypt $E_J \rightarrow \gamma_J$.

      ii. Set $z_{1J} = y_t^{v_J}$. (Accordingly, $z_{1J} =$ [...] and $z_{2J} =$ [...])

      iii. Compute $a_J = g^{u_J}$, $b_{1J} = g^{u_{1J}}$, $b_{2J} = h^{u_{2J}}$ with $u_J, u_{1J}, u_{2J} \in_U \mathbb{Z}_q$.

      iv. Send $(P_{sJ}, a_J, b_{1J}, b_{2J})$ to $\mathcal{U}_1^*$.

      v. Given $e_J$ from $\mathcal{U}_1^*$, choose $d_J \in_U \mathbb{Z}_q$ and compute $c_J := e_J - d_J \bmod q$,
         $r_J := u_J - c_J x \bmod q$, $s_{1J} := u_{1J} - d_J w_1 \bmod q$, and
         $s_{2J} := u_{2J} - d_J w_2 \bmod q$.

      vi. Send $(r_J, c_J, s_{1J}, s_{2J}, d_J)$ to $\mathcal{U}_1^*$.

   $\mathcal{M}_1$ simulates $\mathcal{H}_2$ by returning $\varepsilon \in_U \mathbb{Z}_q$.

5. $\mathcal{U}_1^*$ outputs a signature, say $(\zeta_1, \rho, \varpi, \sigma_1, \sigma_2, \delta)$, that corresponds to $\varepsilon$.

6. Rewind and restart $\mathcal{U}_1^*$ with the same setting. Then choose $I \in_U \{0, \ldots, \ell\}$.

   - If $I = 0$, $\mathcal{M}_1$ simulates $\mathcal{H}_2$ by returning $\varepsilon' \in_U \mathbb{Z}_q$. Otherwise, set $\varepsilon' = \varepsilon$.
   - If $I \neq 0$ and $run_I$ has not yet been completed before the query to
     $\mathcal{H}_2$ is sent, $\mathcal{M}_1$ simulates the execution by using both the $y$-side and $z$-side
     witnesses as above, choosing $d'_J \in_U \mathbb{Z}_q$. Otherwise, $\mathcal{M}_1$ simulates only
     with the $y$-side witness, choosing $d'_J = d_J$.

7. $\mathcal{U}_1^*$ outputs a signature, say $(\zeta_1, \rho', \varpi', \sigma_1', \sigma_2', \delta')$, that corresponds to $\varepsilon'$.

8. If $\delta = \delta'$, simulation fails. Otherwise, $\mathcal{M}_1$ computes
   $w'_1 = (\sigma_1 - \sigma_1')/(\delta' - \delta) \bmod q$, $w'_2 = (\sigma_2 - \sigma_2')/(\delta' - \delta) \bmod q$,
   and outputs $X = (w_1 - w'_1)/(w'_2 - w_2) \bmod q$.

Sketch of success probability evaluation: 

Suppose that all random variables chosen by the simulating signer are determined 
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purely from the random tape so that they are fixed before the simulation starts. 
We consider how 6 in S* is sensitive to the alteration of £ and . . . , 

which are given after $\varepsilon$ is given to $\mathcal{U}_1^*$. Observe that independent variables given
to $\mathcal{U}_1^*$ are $p, q, g, h, y, \mathcal{H}_1, \mathcal{H}_2, sid_i, a_i, b_{1i}, b_{2i}, d_i$ for all $i$, and $\varepsilon$ and the random
tape of $\mathcal{U}_1^*$. All other variables are uniquely determined by these independent
variables and outputs of $\mathcal{U}_1^*$. We wrap all these independent variables into $A$,
except for . . . ,dif}, which is denoted by Dg hereafter. Let D denote 

$D_\varepsilon \setminus \{\varepsilon\}$.

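Let $S$ be the set of all $(A, D_\varepsilon)$ that lead to a success, i.e., $\Pr_{A, D_\varepsilon}[(A, D_\varepsilon) \in S] \geq \varepsilon_1$.
According to the Splitting Lemma [11,22], with probability at least
$\varepsilon_1/2$, randomly selected $A$ satisfies $\Pr_{D_\varepsilon}[(A, D_\varepsilon) \in S] \geq \varepsilon_1/2$. Once $A$ is fixed, $\delta$
is uniquely determined by $D_\varepsilon$. By $\delta \leftarrow D_\varepsilon$, we denote the map from $(A, D_\varepsilon) \in S$
to $\delta$. If $(A, D_\varepsilon) \notin S$, we denote $\bot \leftarrow D_\varepsilon$.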

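Define function $\psi$ as

$\psi(\delta) = \Pr_{D_\varepsilon}[\delta \leftarrow D_\varepsilon]$.

Let $\delta_{max}$ be the value of $\delta$ that maximizes $\psi(\delta)$. That is, $\delta_{max}$ is the value
of $\delta$ that is most likely to appear in the output of $\mathcal{U}_1^*$. Let $\psi_{max} = \psi(\delta_{max})$. We consider two
cases.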

Case 1 ($\psi_{max}$ is not negligible):

In this case, for randomly chosen $D_\varepsilon$ and $D'_\varepsilon$, the adversary is likely to output
signatures that contain $\delta_{max}$ with sufficiently large probability. When $\delta$ is
the same for different $\varepsilon$ from $\mathcal{H}_2$, $\omega$ must differ, as $\delta + \omega = \varepsilon$. Consequently,
with sufficient probability, we obtain $\omega \neq \omega'$, with which the y-side witness can be
extracted as written in Step-7 of Case $\chi = 0$. For more details, we refer to the
proof of Lemma 3 of [1].
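The reasoning of Case 1 can be illustrated with a small numerical sketch. It is not taken from the paper: the toy group (p = 23, q = 11, g = 2), the function names, and the assumption that the y-side part of a signature verifies as g^ρ · y^ω against a fixed commitment are all illustrative. The sketch shows that when δ stays the same while the oracle answer ε changes, the relation δ + ω = ε forces ω to change, and the two y-side responses then reveal x = log_g y.

```python
# Toy parameters (illustrative assumptions): g = 2 has prime order q = 11 in Z_23^*.
p, q, g = 23, 11, 2


def split_challenge(eps, delta):
    """Return omega such that delta + omega = eps (mod q)."""
    return (eps - delta) % q


def extract_y_witness(rho, omega, rho2, omega2):
    """Return x = (rho - rho2)/(omega2 - omega) mod q from two responses to one commitment."""
    denom = (omega2 - omega) % q
    if denom == 0:
        raise ValueError("omega did not change; the y-side witness cannot be extracted")
    return (rho - rho2) * pow(denom, -1, q) % q


x = 4                      # y-side witness, unknown to the extractor
y = pow(g, x, p)
u = 9                      # randomness behind the fixed commitment g^u
commitment = pow(g, u, p)

delta = 6                  # the same delta appears in both signatures
eps, eps2 = 2, 10          # two different random-oracle answers
omega, omega2 = split_challenge(eps, delta), split_challenge(eps2, delta)
rho, rho2 = (u - omega * x) % q, (u - omega2 * x) % q

# Both transcripts verify against the same commitment.
ok1 = pow(g, rho, p) * pow(y, omega, p) % p == commitment
ok2 = pow(g, rho2, p) * pow(y, omega2, p) % p == commitment
x_recovered = extract_y_witness(rho, omega, rho2, omega2)
```

The extractor only uses the two response pairs; the value `x` appears in the sketch solely to build honest transcripts.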

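Case 2 ($\psi_{max}$ is negligible):

In this case, $\delta$ tends to change if $D_\varepsilon$ is altered. Due to [1], randomly chosen
$D_\varepsilon$ and $D'_\varepsilon$ that differ only at one position lead $\mathcal{U}_1^*$ to output two corresponding
signatures $(\zeta_1, \rho, \omega, \sigma_1, \sigma_2, \delta)$ and $(\zeta_1, \rho', \omega', \sigma'_1, \sigma'_2, \delta')$ with sufficiently large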

probability. From these signatures, we can extract w'l , w '2 that satisfy Ci = 
and C/Ci = By assumption, log^,^ z yf log^^ Ci- So w\ yf w'l and W 2 yf w'2 
holds. Accordingly, $X = \log_g h = (w_1 - w'_1)/(w'_2 - w_2) \bmod q$ is computable.

The probability distribution over these cases depends on $A$ and the strategy
of $\mathcal{U}_1^*$. Note that the distribution of $A$ does not depend on the choice of $\chi$, as the
protocol is witness indistinguishable and the public key is generated so that it
is uniformly distributed. Accordingly, the coin flip of $\chi$ turns the simulation to the
proper case with probability 1/2.

In the above, we proved that for $\zeta_1$ in a valid signature, there exists at least
one session that includes $(z_u, z_1)$ that satisfies $\log_{z_u} z = \log_{z_1} \zeta_1$. Since $z_1 (= z_u^v)$
depends on random $v$ chosen by the honest signer and is in $\langle g \rangle$ when $P$ is
valid, $z_1$ is unique among all sessions with overwhelming probability if only
polynomially many sessions are executed.

We also need to prove that a signature cannot be produced without interact- 
ing with the legitimate signer. This can be done by a standard argument that 
uses the Forking Lemma and so is omitted here. 

Finally, we need to show that a session that includes target (z„,zi) can be 
identified from ' . For this, observe that the rightmost equality in Equation 3 

holds because ^ for 7 = log^^ z = log^^ Ci with overwhelming probability 
due to the soundness of P. □ 

Theorem 4 . The proposed scheme is signature traceable if all hash functions are 
random oracles, the discrete logarithm is intractable, and the soundness condition 
of the verifiable encryption scheme holds. 

Proof. We need to show that no adversary can generate a signature containing 
Cl such that Cl 7^ for &ny (C*') stored by the signer. This can be done in 

the same way as done in the proof of Theorem 3. 

In the following, we show that it is infeasible for the user to output two valid 
signatures that contain the same Ci regardless of the user’s behaviour. 

The proof is done by contradiction. Suppose that there exists an adversary $\mathcal{U}_2^*$
that outputs two valid signatures that result in the same session by revocation
with success probability $\varepsilon_2$. Here, $\varepsilon_2$ is not negligible in $n$ and $\mathcal{U}_2^*$ is allowed to
interact with S at most $\ell$ times in an arbitrary fashion. Let $\ell \geq 1$. ($\ell = 0$ was
considered in Theorem 3.)

Now there exist two queries to $\mathcal{H}_2$ that correspond to those two signatures.
In a similar way as used in the proof of Theorem 3, we guess the indexes of
these queries and regard $\mathcal{U}_2^*$ as being successful only if the guess is correct.
Accordingly, this is equivalent to an adversary, say that asks H2 only twice 
and succeeds with probability £3 = £2/ (‘^2^) in producing two signatures in the 
expected relation. 

We construct a machine $\mathcal{M}_2$ that, given $(p, q, g, Y)$, solves $X = \log_g Y$ in
$Z_q$ by using this adversary.

Reduction algorithm: 

1. $\mathcal{M}_2$ sets $(p, q, g) := (p, q, g)$.

2. $\mathcal{M}_2$ sets either $y = Y$ or $y = g^x$ for $x \in_U Z_q$ by flipping coin $\chi$.

3. M2 selects w,wo,wi Gu TZjq and sets h := g"^ and z := g^°, yt := 

4. $\mathcal{M}_2$ selects $I \in_U \{1, \ldots, \ell\}$.

5. $\mathcal{M}_2$ runs the adversary, simulating S, as follows.

— For run $i$ ($i \neq I$), $\mathcal{M}_2$ simulates with the z-side witness in the same way as
shown in Step-3 of Case $\chi = 0$ in the proof of Theorem 3.

— For run $I$,

• if $y = Y$, $\mathcal{M}_2$ simulates with the z-side witness as above, otherwise

• it sets Zi/ = Y and simulate Pg in the standard way by setting TLz 
conveniently. Then follow the rest of the protocol using x. Save 7/ 
by decrypting Ej. 

$\mathcal{M}_2$ simulates $\mathcal{H}_2$ by returning random values, say $\varepsilon_1$ and $\varepsilon_2$.

6. The adversary outputs two signatures.

7. $\mathcal{M}_2$ rewinds and restarts the adversary with the same setting. It selects $J \in_U \{1, 2\}$
and answers the $J$-th query to $\mathcal{H}_2$ with $\varepsilon'_J \in_U Z_q$.

8. The adversary outputs two signatures.

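9. Let $(\zeta_1, \rho, \omega, \sigma_1, \sigma_2, \delta)$ and $(\zeta_1, \rho', \omega', \sigma'_1, \sigma'_2, \delta')$ be the resulting signatures
that correspond to $\varepsilon_J$ and $\varepsilon'_J$ respectively. (If any of the resulting signatures
does not correspond to the hash value, $\mathcal{M}_2$ fails.) If $\chi = 0$ and $\omega \neq \omega'$, $\mathcal{M}_2$
outputs $\log_g y = \log_g Y = (\rho - \rho')/(\omega' - \omega) \bmod q$. If $\chi = 1$ and $\delta \neq \delta'$, it
outputs $\log_g z_{1I} = \log_g Y = (\sigma_1 - \sigma'_1)/(\gamma_I(\delta - \delta')) \bmod q$. Otherwise, $\mathcal{M}_2$ fails.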

We omit the evaluation of success probability as it can be done in the same way 
as shown in the proof of Theorem 3 of [1]. □ 

Due to Theorems 3 and 4, the mapping between each session and valid
signature is bijective with overwhelming probability. Accordingly, we have the
following corollary.

Corollary 1. The proposed scheme is $(\ell, \ell + 1)$-unforgeable for polynomially
bounded $\ell$ if the discrete logarithm is intractable, all hash functions are random
oracles, and the verifiable encryption is secure and simulatable.

6 Remarks and Open Problems 

— When each user uses a unique $(z_u, \xi, E, P)$ repeatedly in all issuing sessions,
i.e. as a public key of the user, the scheme provides blindness (and
unlinkability) in a weak sense. That is, signatures are computationally independent
of each other unless the signer cooperates with the attacker. Such low-level
privacy may be acceptable in applications, as it offers lower computation and
communication complexity in return.

— As briefly mentioned in Section 2, the security definitions and the proofs 
confirm the security under the assumption that the trustee will never be 
abused as an oracle. Accordingly, the trustee must not show the tracing in- 
formation to anybody. To provide stronger security in blindness where the 
trustee can publish the tracing information, we need the following proper- 
ties. First, the verifiable encryption must be non-malleable against adaptive 
chosen message attacks. It also has to provide public verifiability. Second, 
the signature scheme must be unforgeable even for the signer in such a sense 
that for target signature S produced from a session identified by ff the 
signer should not be able to produce valid signature A'(yf S) that results 
in tracing information that is relative to . This property is not achieved 
in our construction even if we restrict S' to be different from S in the part 
necessary for revocation, which is in our case. A particular attack on the 
strong blindness is as follows. The signer transforms Ci in challenge signa- 
ture S into f' = with random a and creates signature S' that includes 
f' by using real signing key x. Session tracing information computed from 
C' will be (^^)“ and the signer can obtain target session identifier f". This 
particular attack can be prevented but we leave a provably secure solution
for this issue an open problem.

— It is important to point out that, since the trustee can recover $\gamma$ from $E$,
he can produce signature S' that results in the same tracing information
linked from signature S legitimately produced by the user. Such a threat can
be eliminated by encrypting $\gamma$ with an encryption key whose decryption key is
not known to anybody. (Remember that the decryption key is not necessary
for the trustee to complete revocation.) But for the sake of the security proof, the
simulator must be able to decrypt it. This is possible, for instance, with the
verifiable encryption scheme in Appendix B, by generating encryption key
$y$ as $y = \mathcal{H}(str)$ where $str$ is a fixed public string and $\mathcal{H}$ is a hash function
$\mathcal{H} : \{0,1\}^* \to \langle g \rangle$. In this way, any party can be convinced that no one
knows the decryption key corresponding to y, but a simulator that simulates 
the hash function as a random oracle in the proof of revocability can assign 
arbitrary as Ti(str) so that x is known only to the simulator. 

— Since revocation only identifies a specific randomness appearing in an issuing
session, it would be necessary to assure that the session is really done by the
user. An easy solution would be to have the transcript signed by the user.
Although the signer may frame the user by creating S' from S so that they
result in the same session tracing information, in a way similar to that shown in
the second remark, one can see that it is not the user who created the second
signature due to Theorem 3.
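The key generation y = H(str) suggested in the third remark can be sketched in Python. The construction below is an assumption made for illustration, not the paper's instantiation: it hashes into Z_p^* with SHA-256 and squares the result, which lands in the subgroup of prime order q whenever p = 2q + 1. The name `hash_to_subgroup`, the counter encoding, and the toy modulus p = 23 are hypothetical, and a real instantiation would need large parameters and care about the slight bias of the modular reduction.

```python
import hashlib


def hash_to_subgroup(data: bytes, p: int) -> int:
    """Map data to a non-identity element of the order-q subgroup of Z_p^*, p = 2q + 1.

    Hashes (data || counter) into Z_p, rejects 0 and +-1, and squares the result.
    """
    counter = 0
    while True:
        digest = hashlib.sha256(data + counter.to_bytes(4, "big")).digest()
        t = int.from_bytes(digest, "big") % p
        if t not in (0, 1, p - 1):
            return pow(t, 2, p)  # squares are exactly the order-q subgroup
        counter += 1


# Toy safe prime (illustrative assumption): p = 23 = 2 * 11 + 1.
p, q = 23, 11
y = hash_to_subgroup(b"fixed public string", p)
in_subgroup = pow(y, q, p) == 1 and y != 1
```

Because y is derived from a public string, anyone can recompute it, and nobody is given its discrete logarithm; in the random-oracle proof the simulator would instead program the oracle's answer, as the remark describes.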
Appendix A

The following verifiable encryption scheme is taken from [13]. Let (n,g, h,£g) be 
the public key and (p, q) be the secret key of the Okamoto-Uchiyama encryption 
scheme. Here, n = p^q, and g is in that satisfies ord(g mod p^) = p(p— 1), h 
= ho" mod n for randomly chosen ho G ^n, and £g is the bit length of the order 
of g. We assume that £g > 2£g where £g is the bit length of g. Let TLi : {0, 1}* — > 
{0, be a hash function. 

Now, 7 is encrypted by Okamoto-Uchiyama encryption as E = g'’'h‘“ mod n 
where Gu TZn- For C = (z„, g^'), (C, E) GlZis proven by providing 

P = (Cu, siu, S 2 u) computed by the prover as follows. 

1. Choose fci Gu {0, and ^2 Gu {0, 

2. Compute = TLi{zu, E , mod p,yt^^ modp, g^^h''^ mod n). 

3. Compute si = fci — c„7 and S 2 = k 2 — in Z . 

Here is a security parameter larger than 1 . P is valid if it satisfies 

Cu G {0,1}^U 
Siu G {0,1}"'’^®, and 

Cu = Tii{Zu,£,,E,Zu^'"g‘''’- modp,?/t„^^“C"“ mod p, mod n). 

The above protocol is a statistical zero-knowledge argument for relation R. 
Soundness is due to the strong RSA assumption over n. The detailed security 
proof can be found in [13]. 

Appendix B 

In this section, we require that $p = 2q + 1$ and $q = 2s + 1$ for prime $s$. (See [26] for
generating such Cunningham chains.) Let $h$ be a generator of a prime-order subgroup
of $Z_q^*$ where $\mathrm{ord}(h) = s$. Let $(x, y) \in Z_s \times \langle h \rangle$ be a key pair of ElGamal
encryption defined over $\langle h \rangle$. That is, $y = h^x \bmod q$.

For 7 G 2Zq and C = {zu,0 = (E,P) is computed as follows. We 

first transform $\gamma$ into $\gamma^* \in \langle h \rangle$ by

$\gamma^* = J_q(\gamma) \cdot \gamma \bmod q$.

Here, $J_q(\gamma)$ is the Jacobi symbol, $\left(\frac{\gamma}{q}\right)$. $\gamma^*$ is then encrypted into $E = (C_1, C_2)$
using ElGamal encryption as

$C_1 = \gamma^* \cdot y^{\omega} \bmod q$,

$C_2 = h^{\omega} \bmod q$,

where ui Gu TZg. When E is decrypted into 7* and mod p 7 is obtained 
by 7 = —1 • 7* mod q. Otherwise, 7 = 7*. 

The proof is done in two steps. In the first step, the prover proves relation
$\log_{z_u} z = \log_g \xi$ by the Chaum-Pedersen protocol [10]. In the second step, we
prove in a zero-knowledge manner that $\mathcal{D}(E) = J_q(\log_g \xi) \cdot \log_g \xi \bmod q$ by
repeating the following protocol sufficiently many times.

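1. The prover selects $a \in_U Z_q^*$ and $b \in_U Z_q$ and sends

$T_0 = \xi^{a} \bmod p$,

$T_1 = C_1 \cdot J_q(a) \cdot a \cdot y^{b} \bmod q$, and

$T_2 = C_2 \cdot h^{b} \bmod q$

to the verifier.

2. The verifier sends $c \in_U \{0, 1\}$ to the prover.

3. The prover sends $(\alpha, \beta)$ where $(\alpha, \beta) = (a, b)$ when $c = 0$, and $(\alpha, \beta) =
(a\gamma \bmod q, b + \omega \bmod q)$ when $c = 1$.

4. The verifier accepts if, for $c = 0$,

$T_0 = \xi^{\alpha} \bmod p$,

$T_1 = C_1 \cdot J_q(\alpha) \cdot \alpha \cdot y^{\beta} \bmod q$, and

$T_2 = C_2 \cdot h^{\beta} \bmod q$,

and for $c = 1$,

$T_0 = g^{\alpha} \bmod p$,

$T_1 = J_q(\alpha) \cdot \alpha \cdot y^{\beta} \bmod q$, and

$T_2 = h^{\beta} \bmod q$.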
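One honest round of the four-step protocol above can be exercised in a small Python sketch. Everything concrete here is an assumption made for illustration: the toy Cunningham chain s = 5, q = 11, p = 23, the generators g = 2 and h = 3, the function names, and the reading T_0 = ξ^a with ξ = g^γ and checks ξ^α (for c = 0) and g^α (for c = 1). The sketch also reduces the exponent b + ω modulo s = ord(h) rather than modulo q, which is an assumption needed for the c = 1 check on T_2 to hold for every choice of b and ω.

```python
# Toy Cunningham chain (illustrative assumption): s = 5, q = 2s + 1 = 11, p = 2q + 1 = 23.
s, q, p = 5, 11, 23
g = 2  # element of order q in Z_p^*
h = 3  # element of order s in Z_q^*


def jacobi_q(a):
    """Jacobi (here Legendre) symbol of a modulo the prime q, by Euler's criterion."""
    return 1 if pow(a, (q - 1) // 2, q) == 1 else -1


def encrypt(gamma, y, omega):
    """ElGamal-encrypt gamma* = J_q(gamma) * gamma mod q under public key y."""
    gamma_star = jacobi_q(gamma) * gamma % q
    return gamma_star * pow(y, omega, q) % q, pow(h, omega, q)


def commit(xi, c1, c2, y, a, b):
    """Step 1: the prover's first message (T0, T1, T2)."""
    t0 = pow(xi, a, p)
    t1 = c1 * jacobi_q(a) * a * pow(y, b, q) % q
    t2 = c2 * pow(h, b, q) % q
    return t0, t1, t2


def respond(c, a, b, gamma, omega):
    """Step 3: open (a, b) for c = 0, or the blinded values for c = 1."""
    if c == 0:
        return a, b
    return a * gamma % q, (b + omega) % s  # exponent reduced mod ord(h) = s (assumption)


def verify(c, t, alpha, beta, xi, c1, c2, y):
    """Step 4: the verifier's check for challenge bit c."""
    t0, t1, t2 = t
    tail = jacobi_q(alpha) * alpha * pow(y, beta, q) % q
    if c == 0:
        return (t0 == pow(xi, alpha, p)
                and t1 == c1 * tail % q
                and t2 == c2 * pow(h, beta, q) % q)
    return t0 == pow(g, alpha, p) and t1 == tail and t2 == pow(h, beta, q)


def run_round(gamma, x, a, b, omega, c):
    """One honest round for plaintext gamma and ElGamal secret key x."""
    y = pow(h, x, q)
    xi = pow(g, gamma, p)
    c1, c2 = encrypt(gamma, y, omega)
    t = commit(xi, c1, c2, y, a, b)
    alpha, beta = respond(c, a, b, gamma, omega)
    return verify(c, t, alpha, beta, xi, c1, c2, y)


results = [run_round(gamma=7, x=3, a=4, b=2, omega=4, c=c) for c in (0, 1)]
```

The c = 1 check works because the Jacobi symbol is multiplicative, so J_q(γ)γ · J_q(a)a = J_q(aγ)aγ mod q, and because C_2 · h^b = h^{ω+b}. In practice a and b would be drawn fresh at random in every repetition.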

It is not hard to see that the above is correct, sound, and perfectly zero-knowledge
for any verifier. As usual, this method can be made non-interactive by executing
all repetitions in parallel and creating the challenge c by hashing all data before
the second step.